Abstract
Artificial intelligence (AI) based drug design has demonstrated great potential to fundamentally change the pharmaceutical industry. However, a key issue in all AI-based drug design models is efficient molecular representation and featurization. Recently, topological data analysis (TDA) has been used for molecular representation, and its combination with machine learning models has achieved considerable success in drug design. In this talk, we will introduce our recently proposed persistent models for molecular representation and featurization. In these models, molecular structures and interactions are characterized by various topological objects, including hypergraphs, Dowker complexes, neighborhood complexes, and Hom-complexes. Mathematical invariants of these objects can then be calculated to give a quantitative featurization of the molecules. By considering a filtration of each representation, that is, a nested sequence of the topological object indexed by a scale parameter, and tracking how the invariants change along it, we construct various persistent functions, such as persistent homology, persistent spectral theory, and persistent Tor-algebra. These persistent functions are used as molecular descriptors for machine learning models, and models built on them achieve state-of-the-art results.
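
To make the filtration idea concrete, the following is a minimal sketch of the simplest case only: degree-0 persistent homology of a Vietoris-Rips filtration on a point cloud, turned into a fixed-length feature vector. It is an illustration under stated assumptions, not the persistent models presented in the talk: the function names `h0_barcode` and `featurize`, the toy coordinates, and the choice of sampling Betti-0 counts at fixed scales are all hypothetical, and the hypergraph, Dowker, neighborhood, and Hom-complex constructions are not reproduced here.

```python
import itertools
import math


def h0_barcode(points):
    """Degree-0 persistence bars (birth, death) of a Vietoris-Rips filtration.

    Every point is born as its own component at scale 0. As the scale grows,
    an edge appears when it reaches the distance between two points; if that
    edge joins two different components, one of them dies at that scale.
    """
    n = len(points)
    if n == 0:
        return []
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        # Find the component representative, compressing the path as we go.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # All pairwise edges, sorted by length (the filtration order).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in itertools.combinations(range(n), 2)
    )

    bars = []
    for length, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj               # merge the two components
            bars.append((0.0, length))    # one component dies at this scale
    bars.append((0.0, math.inf))          # the last component never dies
    return bars


def featurize(bars, scales):
    """Betti-0 curve: number of bars alive at each sampled filtration value."""
    return [sum(1 for birth, death in bars if birth <= r < death) for r in scales]


# Hypothetical toy "molecule": three atom positions on a line.
atoms = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (5.0, 0.0, 0.0)]
bars = h0_barcode(atoms)
features = featurize(bars, scales=[0.5, 2.0, 4.5])
print(bars)      # [(0.0, 1.0), (0.0, 4.0), (0.0, inf)]
print(features)  # [3, 2, 1]
```

A vector such as `features` is the kind of fixed-length descriptor that can be passed to a standard machine learning model; higher-degree homology and the spectral or Tor-algebra invariants would need a dedicated library rather than this union-find shortcut.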