Microbiome Data Science - Phylogenetic Tree, Bacterial Growth Rate and Biosynthetic Gene Clusters

Source: 01-30

Time: Mon., Jan. 30, 2023, 10:00 am-12:00 pm

Venue: Online: Zoom ID: 4552601552; PW: YMSC. Offline: Lecture Hall, Floor 3, Jin Chun Yuan West Bldg.

Speaker: Hongzhe Li (李洪哲)

Speaker

Dr. Hongzhe Li is Perelman Professor of Biostatistics, Epidemiology and Informatics at the Perelman School of Medicine at the University of Pennsylvania. He is Vice Chair of Research Integration, Director of the Center of Statistics in Big Data and former Chair of the Graduate Program in Biostatistics at Penn. He is also a Professor of Statistics and Data Science at the Wharton School. Dr. Li has been elected a Fellow of the American Statistical Association (ASA), a Fellow of the Institute of Mathematical Statistics (IMS) and a Fellow of the American Association for the Advancement of Science (AAAS). Dr. Li served on the Board of Scientific Counselors of the National Cancer Institute of NIH and regularly serves on various NIH study sections. He served as Chair of the Section on Statistics in Genomics and Genetics of the ASA and as Co-Editor-in-Chief of Statistics in Biosciences. Dr. Li's research focuses on developing statistical and computational methods for the analysis of large-scale genetic, genomic and metagenomic data, and on the theory of high-dimensional statistics. He has published over 240 papers, including papers in Science, Nature, Nature Genetics, Nature Methods, Nature Microbiology, Science Translational Medicine, Cell Host & Microbe, JASA, JRSS, Biometrika, Biometrics and Annals of Applied Statistics. He has trained over 50 PhD students and postdoctoral fellows.


Abstract

The gut microbiome plays an important role in maintaining human health. High-throughput shotgun metagenomic sequencing of a large set of samples provides an important tool to interrogate the gut microbiome. Besides providing footprints of taxonomic community composition and genes, these data can be further explored to study bacterial growth rates and metabolic potential via the generation of small molecules and secondary metabolites. Everything from microbiome diagnosis to microbiome-based therapy will rely on vast amounts of data analysis. In this talk, I will present several computational and statistical methods for the analysis of data measured on a phylogenetic tree, and methods for estimating bacterial growth rates for metagenome-assembled genomes (MAGs). I will also present a deep learning algorithm for predicting all biosynthetic gene clusters (BGCs) in bacterial genomes. The key statistical and computational tools used include Wasserstein distance estimation, optimal permutation recovery based on low-rank matrix projection, and an LSTM deep learning method to improve prediction of BGCs. I will demonstrate the application of these methods using several ongoing microbiome studies of inflammatory bowel disease at the University of Pennsylvania.
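
For readers unfamiliar with Wasserstein distances on trees, here is a minimal illustrative sketch. It is not the speaker's estimator. It only computes the standard closed form of the 1-Wasserstein distance between two abundance profiles placed on the nodes of a rooted tree: each branch contributes its length times the absolute difference in the mass sitting below it. The function name, the toy tree, the branch lengths and the two profiles are all hypothetical and chosen purely for illustration.

```python
def tree_wasserstein(parent, edge_length, mu, nu):
    """1-Wasserstein distance between two distributions on a rooted tree.

    parent:      dict child -> parent (the root has no entry)
    edge_length: dict child -> length of the branch joining child to its parent
    mu, nu:      dict node -> mass; each profile is assumed to sum to 1
    """
    # diff[child] accumulates (mu - nu) over the subtree hanging below
    # the branch child -> parent[child].
    diff = {}
    for node in set(mu) | set(nu):
        delta = mu.get(node, 0.0) - nu.get(node, 0.0)
        # Walk up to the root; every branch on the way sits above this node.
        while node in parent:
            diff[node] = diff.get(node, 0.0) + delta
            node = parent[node]
    # Each branch carries |net mass imbalance below it| over its length.
    return sum(edge_length[child] * abs(d) for child, d in diff.items())


# Hypothetical toy tree: root -> A -> {a1, a2}, root -> C
parent = {"A": "root", "C": "root", "a1": "A", "a2": "A"}
edge_length = {"A": 1.0, "C": 2.0, "a1": 0.5, "a2": 0.5}

# Two hypothetical relative-abundance profiles on the leaves
mu = {"a1": 0.5, "a2": 0.5}
nu = {"a1": 0.5, "C": 0.5}

distance = tree_wasserstein(parent, edge_length, mu, nu)
print(distance)  # 1.75: half the mass travels a2 -> A -> root -> C (length 3.5)
```

In microbiome analysis this quantity coincides, up to normalization conventions, with a weighted UniFrac-type distance between samples.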
