Topics in Statistics and Data Science

Source: 09-01

Time: Tues./Wed. 15:20-16:55, Sept. 13-Dec. 12, 2022

Venue: Zoom Meeting ID: 276 366 7254; Passcode: YMSC

Speaker: Yannis Yatracos

Description:

Research results of the instructor over the years, including more recent work on the Foundations of Data Science for algorithmic (black-box) models, i.e. with no model assumptions on the data.

Topics include: Cluster and structure detection with the Variance Components Split (VCS) method, with an application to the separation of cryptocurrencies from other assets.

EDI-graph: a tool to determine, via expected p-values, almost sure identifiability and discrimination of parameters for black-box models. Application to the selection of Data-Generating Machines, in particular Learning Machines.

The Residuals Influence Index (RINFIN) is introduced for linear least squares regression of Y on X. Its components measure the local influence of x on the residual, and a large value flags a bad leverage case. Large sample properties of RINFIN are presented, with applications to microarray data and simulated high-dimensional data.

Pathologies of the Bootstrap.
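
As an illustration only, and not necessarily one of the cases treated in the course, a classical setting where the nonparametric bootstrap behaves badly is the sample maximum. A resample reproduces the observed maximum with probability 1 - (1 - 1/n)^n, which tends to 1 - 1/e (about 0.632) rather than to 0, so the bootstrap distribution of the maximum keeps a large atom at the observed value. A minimal Python sketch, where the function name, sample size, number of resamples, and seed are all assumptions chosen for the example:

```python
import random

def bootstrap_max_atom(n=50, n_boot=20000, seed=0):
    """Fraction of bootstrap resamples whose maximum equals the sample maximum."""
    rng = random.Random(seed)
    sample = [rng.random() for _ in range(n)]  # i.i.d. Uniform(0, 1) draws
    sample_max = max(sample)
    hits = 0
    for _ in range(n_boot):
        resample = rng.choices(sample, k=n)  # n draws with replacement
        if max(resample) == sample_max:
            hits += 1
    return hits / n_boot

n = 50
estimate = bootstrap_max_atom(n=n)
theory = 1 - (1 - 1 / n) ** n  # exact probability that the maximum is resampled
print(f"simulated: {estimate:.3f}, exact: {theory:.3f}")
```

The simulated fraction should sit close to the exact value for any n, which is the point: the atom does not vanish as the sample grows.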

Pathologies of the MLE, with correction using the Model Updated MLE (MUMLE) and the DECK-principle: D=Data, E=Evolves, C=Creates, K=Knowledge. Relation of MUMLE to Wallace’s Minimum Message Length method.

Pathologies of the Wasserstein distance in Statistical Inference.

Artificially augmented samples, shrinkage and MSE reduction.
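
The link between shrinkage and MSE reduction can be seen in a textbook case, offered here as a generic illustration and not as the instructor's construction with augmented samples. For normal data, dividing the sum of squared deviations by n + 1 instead of n - 1 gives a biased variance estimator with smaller mean squared error: 2σ⁴/(n + 1) versus 2σ⁴/(n - 1). A small simulation sketch in Python, with hypothetical function and parameter names:

```python
import random

def mse_of_variance_estimators(n=10, sigma=1.0, n_rep=20000, seed=0):
    """Monte Carlo MSE of the variance estimators SS/(n-1) and SS/(n+1)."""
    rng = random.Random(seed)
    true_var = sigma ** 2
    sq_err_unbiased = 0.0
    sq_err_shrunk = 0.0
    for _ in range(n_rep):
        x = [rng.gauss(0.0, sigma) for _ in range(n)]
        mean = sum(x) / n
        ss = sum((xi - mean) ** 2 for xi in x)  # sum of squared deviations
        sq_err_unbiased += (ss / (n - 1) - true_var) ** 2
        sq_err_shrunk += (ss / (n + 1) - true_var) ** 2  # shrunk toward zero
    return sq_err_unbiased / n_rep, sq_err_shrunk / n_rep

mse_unbiased, mse_shrunk = mse_of_variance_estimators()
print(f"MSE with divisor n-1: {mse_unbiased:.4f}")
print(f"MSE with divisor n+1: {mse_shrunk:.4f}")
```

With n = 10 and σ = 1 the theoretical values are 2/9 and 2/11, so the shrunken estimator should show the smaller simulated MSE despite its bias.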

Additional topics if time permits:

Elegant Nonparametric Estimation of a density and a regression type function, with rates of convergence in Probability.

Matching Estimation of a Black-Box parameter, with convergence rates of the estimates using an extension of Wolfowitz’s Minimum Distance Method.

Fiducial Approximate Bayesian Computations (F-ABC).
