Variable Selection for Partial Linear Additive Quantile Regression Models Under Ultrahigh-Dimensional Data
OA; Peking University Core (北大核心); CHSSCD; CSSCI; CSTPCD
In ultrahigh-dimensional data, on the one hand, the dimensionality of the covariates may be much larger than the sample size and may even grow exponentially with it; on the other hand, such data are typically heterogeneous: the influence of covariates on the center of the conditional distribution may differ greatly from their influence on the tails, and heavy tails and outliers may also arise. This paper studies variable selection and robust estimation for partial linear additive quantile regression models when the covariate dimension diverges and is ultrahigh. First, to achieve model sparsity and nonparametric smoothness, a nonconvex Atan double penalty is introduced, and a quantile iterative coordinate descent algorithm is used to solve the resulting optimization problem. The theoretical properties of the proposed double-penalized estimator are established under appropriately chosen regularization parameters. Second, simulation studies are used to assess the performance of the proposed method. The simulation results indicate that it outperforms other penalized methods, especially when the data are heavy-tailed. Finally, the practical value of the method is illustrated through an empirical analysis of blood sample data from cancer screening patients.
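The two ingredients named in the abstract, the quantile check loss and the Atan penalty, can be sketched as follows. This is a minimal illustration only, not the paper's estimator: the abstract does not give the penalty formula or the exact form of the double penalty, so the sketch assumes the commonly used Atan form λ(γ + 2/π)·arctan(|β|/γ) and applies it to the coefficients of a plain linear fit. The function names (`check_loss`, `atan_penalty`, `penalized_objective`) and the tuning parameters `lam` and `gamma` are hypothetical.

```python
import math


def check_loss(u, tau):
    """Quantile check loss rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))


def atan_penalty(beta, lam, gamma):
    """Atan penalty in an ASSUMED form (not stated in the abstract):
    lam * (gamma + 2/pi) * arctan(|beta| / gamma), with lam >= 0, gamma > 0."""
    return lam * (gamma + 2.0 / math.pi) * math.atan(abs(beta) / gamma)


def penalized_objective(y, x, beta, tau, lam, gamma):
    """Average check loss of a linear fit plus Atan penalties on each coefficient.

    y: list of responses; x: list of covariate rows; beta: list of coefficients.
    This covers only a linear part; the nonparametric additive part and the
    second penalty of the paper's double penalty are not modeled here.
    """
    n = len(y)
    loss = 0.0
    for yi, xi in zip(y, x):
        fitted = sum(b * xij for b, xij in zip(beta, xi))
        loss += check_loss(yi - fitted, tau)
    penalty = sum(atan_penalty(b, lam, gamma) for b in beta)
    return loss / n + penalty
```

Under this assumed form the penalty is zero at β = 0 and bounded above by λ(γπ/2 + 1), so for small γ every nonzero coefficient pays a penalty close to λ regardless of its size. The check loss grows only linearly in the residual, which fits the abstract's emphasis on robustness to heavy tails.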
Bai Yongxin (白永昕); Qian Manling (钱曼玲); Tian Maozai (田茂再)
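School of Science, Beijing Information Science and Technology University, Beijing 100192; School of Mathematics and Statistics, University of Melbourne, Melbourne 3010, Australia; Center for Applied Statistics, Renmin University of China, Beijing 100872; School of Statistics and Information, Xinjiang University of Finance and Economics, Urumqi 830012; School of Mathematics and Data Science, Changji University, Changji, Xinjiang 831100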
Mathematics
ultrahigh-dimensional data; quantile regression; partial linear additivity; variable selection; Atan double penalty
Statistics & Decision (《统计与决策》), 2024, No. 9
pp. 43-48 (6 pages)
Supported by the Beijing Natural Science Foundation (1242005) and the Beijing Information Science and Technology University Research Fund (2022XJJ31)