All-in-One Toolkit for Biobank-Scale Whole-Genome Sequencing Data Management and Analysis

Time: Mon., 14:00-15:00, Nov. 10, 2025

Venue: C548, Shuangqing Complex Building A

Organizer: Yunan Wu

Speaker: Zilin Li

Statistical Seminar


Speaker

Zilin Li 李子林

School of Mathematics and Statistics, Northeast Normal University


Organizer

Yunan Wu 吴宇楠 (YMSC)




All-in-One Toolkit for Biobank-Scale Whole-Genome Sequencing Data Management and Analysis

Biobank-scale whole-genome sequencing (WGS) studies are increasingly pivotal in unraveling the genetic bases of diverse health outcomes. However, the sheer volume and complexity of these datasets make them challenging to manage and analyze. We propose vcf2agds, an all-in-one toolkit that efficiently converts WGS data from Variant Call Format (VCF) to the annotated Genomic Data Structure (aGDS) format, significantly reducing data size while supporting seamless integration of genomic and functional data for comprehensive genetic analyses. Additionally, STAARpipeline, equipped with the aGDS files, enables scalable, comprehensive, and functionally informed WGS analysis, facilitating the detection of common and rare, coding and noncoding phenotype-genotype associations. We applied STAARpipeline to analyze Alzheimer's disease (AD) in 459,216 samples from the UK Biobank. All analyses scale well in computation time and memory. We discover several potentially new significant associations with AD. As WGS datasets continue to expand in size and complexity, our proposed tools will be increasingly useful for unlocking the full potential of genomic research.


About the speaker

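Professor Zilin Li received both a bachelor's degree and a Ph.D. from the Department of Mathematical Sciences at Tsinghua University, and studied under Xihong Lin, a member of both the U.S. National Academy of Sciences and the National Academy of Medicine. Professor Li's main research interests are the theory of statistical methods for high-dimensional data and statistical genetics. Professor Li previously served as an Assistant Professor in the Department of Biostatistics and Health Data Science at the Indiana University School of Medicine, and as a postdoctoral fellow, research associate, and research scientist in the Department of Biostatistics at Harvard University, and is currently a Professor in the School of Mathematics and Statistics at Northeast Normal University. In 2023, Professor Li was elected an Elected Member of the International Statistical Institute. Professor Li's research has been published, as first or corresponding author, in international journals including the Journal of the American Statistical Association, Nature Methods, and Nature Genetics.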

Date: November 8, 2025