Microbiome Data Science - Phylogenetic Tree, Bacterial Growth Rate and Biosynthetic Gene Clusters

Time: Mon., 10:00am-12:00pm, Jan. 30, 2023

Venue: Online: Zoom ID: 4552601552; PW: YMSC. Offline: Lecture Hall, Floor 3, Jin Chun Yuan West Bldg.

Speaker: Hongzhe Li 李洪哲

Speaker

Dr. Hongzhe Li is Perelman Professor of Biostatistics, Epidemiology and Informatics at the Perelman School of Medicine at the University of Pennsylvania. He is Vice Chair of Research Integration, Director of the Center for Statistics in Big Data and former Chair of the Graduate Program in Biostatistics at Penn. He is also a Professor of Statistics and Data Science at the Wharton School. Dr. Li has been elected a Fellow of the American Statistical Association (ASA), a Fellow of the Institute of Mathematical Statistics (IMS) and a Fellow of the American Association for the Advancement of Science (AAAS). Dr. Li served on the Board of Scientific Counselors of the National Cancer Institute of NIH and regularly serves on various NIH study sections. He served as Chair of the Section on Statistics in Genomics and Genetics of the ASA and as Co-Editor-in-Chief of Statistics in Biosciences. Dr. Li’s research focuses on developing statistical and computational methods for the analysis of large-scale genetic, genomic and metagenomic data and on the theory of high-dimensional statistics. He has published over 240 papers, including papers in Science, Nature, Nature Genetics, Nature Methods, Nature Microbiology, Science Translational Medicine, Cell Host & Microbe, JASA, JRSS, Biometrika, Biometrics and Annals of Applied Statistics. He has trained over 50 PhD students and postdoctoral fellows.

Abstract

The gut microbiome plays an important role in maintaining human health. High-throughput shotgun metagenomic sequencing of a large set of samples provides an important tool to interrogate the gut microbiome. Besides providing footprints of taxonomic community composition and genes, these data can be further explored to study bacterial growth rates and metabolic potential via the generation of small molecules and secondary metabolites. Everything from microbiome diagnosis to microbiome-based therapy will rely on vast amounts of data analysis. In this talk, I will present several computational and statistical methods for the analysis of data measured on a phylogenetic tree and methods for estimating bacterial growth rates for metagenome-assembled genomes (MAGs). I will also present a deep learning algorithm for predicting all biosynthetic gene clusters (BGCs) in bacterial genomes. The key statistical and computational tools used include Wasserstein distance estimation, optimal permutation recovery based on low-rank matrix projection and an LSTM deep learning method to improve prediction of BGCs. I will demonstrate the application of these methods using several ongoing microbiome studies of inflammatory bowel disease at the University of Pennsylvania.

DATE: January 30, 2023
