
How data science and machine learning interpret genomic data and contribute to personalized medicine

Time: Tues., 16:30-17:30, July 9, 2024

Venue: Tsinghua University West Lecture Hall (清华大学西阶梯教室); Online: Zoom Meeting ID: 455 260 1552, Passcode: YMSC

Speaker: Kathryn Roeder (Carnegie Mellon University)

Abstract:

High-throughput genomics yields vast amounts of data for personalized medicine and other health-related discoveries. For instance, genome-wide association studies (GWAS), which involve tens of thousands to millions of subjects, have linked thousands of genetic changes, or variants, with human diseases. Accumulating these variants across a subject's entire genome can help predict that subject's risk for various diseases, and these findings have already contributed in some instances to improved clinical treatment. However, even with the vast amount of information available, predictive power is typically weak when standard analytical techniques are used. Breakthroughs are anticipated in the near future from machine learning and AI techniques. On another front, CRISPR, a genetic engineering marvel, holds breathtaking potential for treatments of cancer and other genetic defects. To realize these benefits, careful study of immense amounts of data will be required. Data science and machine learning must become an integral part of genomics to fully realize the potential of CRISPR, GWAS, and other genomic studies in the coming decade.


Speaker:

Kathryn Roeder is the UPMC Professor of Statistics and Life Sciences in the Departments of Statistics & Data Science and Computational Biology at Carnegie Mellon University. She earned her Ph.D. in statistics at Pennsylvania State University, after which she was on the faculty at Yale University for six years before coming to Carnegie Mellon University in 1994. In 1997 she received the COPSS Presidents' Award for the outstanding statistician under age 40. In 2019 she was inducted into the National Academy of Sciences. In 2020 she was awarded the COPSS Distinguished Achievement Award and Lectureship. Her research group develops statistical tools for genetic and genomic data to understand the workings of the human brain and its interplay with genetic variation. These tools draw on a range of statistical and machine learning methods, including causal inference, latent space embedding, sparse PCA, and high-dimensional nonparametric techniques.

Date: July 8, 2024