Homogeneity pursuit in ranking inferences based on pairwise comparison data

Time: Fri., 14:00-15:00, March 15, 2024

Venue: C654, Shuangqing Complex Building A, Tsinghua University; Zoom Meeting ID: 271 534 5558, Passcode: YMSC

Speaker: Yuxin Tao (陶宇心), Tsinghua University

Abstract

The Bradley-Terry-Luce (BTL) model is one of the most celebrated models for ranking inference based on pairwise comparison data. It associates each individual with a latent preference score and ranks individuals by these scores. An important question that arises is how to quantify the uncertainty of the ranks. It is natural to think that the relative ranks of two individuals are not trustworthy if there is only a subtle difference in their preference scores. In this paper, we explore the homogeneity of scores in the BTL model, which assumes that individuals cluster into groups with the same preference scores. We introduce the clustering algorithm in regression via data-driven segmentation (CARDS) penalty into the likelihood function, which can rigorously and automatically separate parameters and uncover group structure. Statistical properties of two versions of CARDS are analyzed. As a result, we achieve a faster convergence rate and sharper confidence intervals for the maximum likelihood estimation of preference scores, providing insight into the power of exploring low-dimensional structure in a high-dimensional setting. We analyze real data examples, including sports and journal ranking, to highlight the improved prediction performance and interpretability of our method.
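
For readers unfamiliar with the BTL model, the following is a minimal sketch of the unpenalized baseline only, not the speaker's method. Under BTL, individual i beats individual j with probability exp(theta_i) / (exp(theta_i) + exp(theta_j)). The sketch fits the scores by maximum likelihood using the standard minorization-maximization (MM) iteration. The CARDS penalty from the abstract is not implemented here, and the function names and the toy win counts are hypothetical, chosen purely for illustration.

```python
import math


def btl_win_prob(theta_i, theta_j):
    """Probability that i beats j under BTL, with scores on the log scale."""
    return 1.0 / (1.0 + math.exp(-(theta_i - theta_j)))


def fit_btl(wins, n_iter=500):
    """Fit BTL scores by plain maximum likelihood (no CARDS penalty).

    wins[i][j] is the number of times i beat j. The MM update below
    assumes every individual has at least one win and one loss and that
    the comparison graph is connected, so that the MLE exists.
    Returns log-scale scores centered to sum to zero.
    """
    n = len(wins)
    w = [1.0] * n  # scores on the exp(theta) scale
    for _ in range(n_iter):
        new_w = []
        for i in range(n):
            total_wins = sum(wins[i][j] for j in range(n) if j != i)
            denom = sum(
                (wins[i][j] + wins[j][i]) / (w[i] + w[j])
                for j in range(n)
                if j != i
            )
            new_w.append(total_wins / denom)
        scale = sum(new_w)
        w = [x * n / scale for x in new_w]  # fix the arbitrary scale
    theta = [math.log(x) for x in w]
    mean = sum(theta) / n
    return [t - mean for t in theta]


# Hypothetical toy data: three individuals, each pair compared 10 times.
wins = [
    [0, 6, 8],
    [4, 0, 7],
    [2, 3, 0],
]
theta = fit_btl(wins)
ranking = sorted(range(len(theta)), key=lambda i: -theta[i])
print("scores:", [round(t, 3) for t in theta])
print("ranking (best first):", ranking)
```

When two fitted scores are close, the order between the corresponding individuals is weakly supported by the data, which is the situation the homogeneity assumption in the abstract is meant to address.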


About the speaker

Yuxin Tao is a fifth-year Ph.D. student in the Center for Statistical Science at Tsinghua University, advised by Professor Dong Li. Yuxin visited the Department of Statistics at Harvard University from 2022 to 2023, mentored by Professor Tracy Ke. Yuxin's primary research interests include financial econometrics, network analysis, ranking inference, and topic modeling. Yuxin also conducts interdisciplinary research with experts in ecology and epidemiology. Yuxin's research as a Ph.D. candidate has been published in the Journal of Econometrics, Statistica Sinica, and Proceedings of the National Academy of Sciences.

Date: March 15, 2024