统计与决策, 2025, Vol. 41, Issue 5: 43-48, 6. DOI: 10.13546/j.cnki.tjyjc.2025.05.007
超高维数据的特征筛选方法研究
Research on Feature Screening Approach of Ultrahigh Dimensional Data
Abstract
This paper first proposes a new marginal screening method (FMAS-SIS) for ultrahigh-dimensional data. The method uses a slice-fusion technique: continuous variables are sliced into discrete variables, and different slicing schemes are fused, so that categorical, discrete, and continuous response variables can all be handled effectively. Second, under certain regularity conditions, the sure screening property and ranking consistency of the method are proved. Finally, numerical simulations and real-data examples compare the proposed method with other screening methods to demonstrate its finite-sample performance. Overall, FMAS-SIS has the following advantages: first, it is a nonparametric, model-free method that does not depend on model assumptions; second, it involves only empirical estimation of conditional distribution functions, so it is simple to compute and easy to implement; third, it retains excellent screening performance even when the predictors or random errors are heavy-tailed, the predictors are strongly correlated, or outliers are present; fourth, it is insensitive to the slicing scheme.
Keywords (Chinese)
超高维数据 / 特征筛选 / 切片-融合技术 / 确定筛选性
Key words
ultrahigh-dimensional data / feature screening / slice-fusion method / sure screening property
Classification
Mathematical and Physical Sciences
Cite this article
闫彤, 刘祎. 超高维数据的特征筛选方法研究[J]. 统计与决策, 2025, 41(5): 43-48, 6.
Funding
National Natural Science Foundation of China (11801567)
Natural Science Foundation of Shandong Province (ZR2024QA018)