Quadric hypersurface intersection for manifold learning in feature space

Source: 12-16

Time: 2022-12-16 Fri 10:30-11:30

Venue: Zoom: 293 812 9202 (PW: BIMSA)

Organizers: Jie Wu, Jingyan Li, Xiang Liu, Fedor Pavutnitskiy

Speaker: Fedor Pavutnitskiy (BIMSA)

Abstract

Knowing that data lies close to a particular submanifold of the ambient Euclidean space can be useful in several ways. For instance, one may want to automatically mark any point far from the submanifold as an outlier, or to use geodesic distance along the submanifold to measure similarity between points. Classical manifold learning problems are often posed in very high dimension, e.g. for spaces of images or spaces of word representations. Today, with deep representation learning on the rise in areas such as computer vision and natural language processing, many problems of this kind can be transformed into problems of moderately high dimension, typically on the order of hundreds. Motivated by this, we propose a manifold learning technique suited to moderately high dimension and large datasets. The manifold is learned from the training data as an intersection of quadric hypersurfaces, which are simple but expressive objects. At test time, this manifold can be used to define an outlier score for arbitrary new points and to improve a given similarity metric by incorporating the learned geometric structure into it.
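To make the idea of an intersection of quadric hypersurfaces concrete, here is a minimal sketch in Python with NumPy. It is an illustration under stated assumptions, not the method presented in the talk: the abstract does not describe the training procedure, so the sketch substitutes a plain algebraic least-squares fit (the coefficient vectors are the right singular vectors of the degree-2 monomial matrix with the smallest singular values), and the function names `quadratic_features`, `fit_quadrics`, and `outlier_score` are hypothetical.

```python
import numpy as np

def quadratic_features(X):
    """Map each row x to the monomials of degree <= 2: [1, x_i, x_i * x_j for i <= j]."""
    n, d = X.shape
    i, j = np.triu_indices(d)          # index pairs with i <= j
    return np.hstack([np.ones((n, 1)), X, X[:, i] * X[:, j]])

def fit_quadrics(X, k):
    """Return k unit coefficient vectors (rows) of quadrics that nearly vanish on X.

    Assumption: an algebraic least-squares fit, used here only for illustration.
    """
    Phi = quadratic_features(X)
    # Rows of Vt are right singular vectors, sorted by decreasing singular value,
    # so the last k rows are the directions on which Phi is smallest.
    _, _, Vt = np.linalg.svd(Phi, full_matrices=False)
    return Vt[-k:]

def outlier_score(coeffs, X):
    """Norm of the quadric residuals at each point (an algebraic, not geodesic, quantity)."""
    return np.linalg.norm(quadratic_features(X) @ coeffs.T, axis=1)

# Toy data: points on the unit circle in the plane, which is cut out by one quadric.
theta = np.linspace(0.0, 2.0 * np.pi, 200, endpoint=False)
train = np.column_stack([np.cos(theta), np.sin(theta)])

quadrics = fit_quadrics(train, k=1)
on_manifold = outlier_score(quadrics, train)                   # close to zero
off_manifold = outlier_score(quadrics, np.array([[2.0, 0.0]])) # clearly positive
```

In this toy case the fitted quadric is proportional to x^2 + y^2 - 1, so training points score near zero while the point (2, 0) scores well above it. The residual is an algebraic quantity rather than a true distance to the manifold, and the number of quadrics `k` has to be chosen by the user in this sketch.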
