Integrated Estimation Method for Probability Samples and Non-Probability Samples

Luo Shihua (罗世华)¹, Dai Yufang (戴玉芳)¹

Statistics & Decision (统计与决策), 2026, Vol. 42, Issue 9: 42-48, 7. DOI: 10.13546/j.cnki.tjyjc.2026.09.007

Author information

1. School of Statistics and Data Science, Jiangxi University of Finance and Economics, Nanchang 330013

Abstract

In traditional probability sampling, rising costs and declining response rates leave too few valid samples, and missing target variables lead to biased estimates. In addition, the sampling probabilities of non-probability samples from online surveys are unknown, and estimating them with a Logistic regression model is sensitive to model specification, which may produce extreme probabilities and, in turn, highly variable estimates. To address these issues, this paper proposes an integrated estimation method that combines probability samples and non-probability samples. First, the method uses XGBoost to estimate the target variable of the probability samples and obtain an initial estimate. It then estimates the sampling probabilities of both the probability samples and the non-probability samples, uses kernel smoothing to measure the similarity between the sampling probabilities of the two types of samples, allocates the weights of the probability samples to the non-probability samples according to that similarity, and performs weighted estimation of the known target variable based on the estimated weights. Finally, it combines the two types of samples into one sample and adjusts their weights by minimizing the MSE of the combined estimator, thereby yielding an estimate for the overall population. The results of both simulation and empirical studies indicate that, under various circumstances, the proposed method outperforms the other methods in terms of both bias and mean squared error.
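The kernel-smoothing and MSE-minimizing steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives no formulas, so the Gaussian kernel, the bandwidth value, the weight-sharing rule, and the convex-combination form of the combined estimator are all assumptions, and every function name is hypothetical. The XGBoost prediction step and the estimation of the sampling probabilities are omitted; the propensities are taken as already computed.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel (an assumed choice of kernel)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_transfer_weights(p_prob, d_prob, p_nonprob, bandwidth=0.05):
    """Share each probability-sample weight among the non-probability units.

    p_prob, p_nonprob: estimated sampling probabilities of the two samples.
    d_prob: weights of the probability-sample units.
    Each weight d_i is split in proportion to the kernel similarity between
    the propensity of unit i and that of every non-probability unit j.
    """
    weights = [0.0] * len(p_nonprob)
    for p_i, d_i in zip(p_prob, d_prob):
        sims = [gaussian_kernel((p_i - p_j) / bandwidth) for p_j in p_nonprob]
        total = sum(sims)
        if total == 0.0:
            # No non-probability unit is close enough; this weight is dropped.
            continue
        for j, s_ij in enumerate(sims):
            weights[j] += d_i * s_ij / total
    return weights


def weighted_mean(y, w):
    """Weighted mean of the observed target variable."""
    return sum(y_j * w_j for y_j, w_j in zip(y, w)) / sum(w)


def mse_optimal_lambda(var_prob, var_nonprob, bias_nonprob):
    """Weight on the probability-sample estimator that minimizes the MSE of
    lam * theta_prob + (1 - lam) * theta_nonprob.

    Assumes independent estimators, an unbiased theta_prob, and var_prob > 0.
    """
    mse_nonprob = var_nonprob + bias_nonprob ** 2
    return mse_nonprob / (var_prob + mse_nonprob)


def combine(theta_prob, theta_nonprob, lam):
    """Convex combination of the two estimators."""
    return lam * theta_prob + (1.0 - lam) * theta_nonprob
```

Under these assumptions the transferred weights sum to the total weight of the probability sample whenever every probability unit has at least one non-probability unit with nonzero kernel similarity. The combination weight follows from minimizing lam² · var_prob + (1 - lam)² · (var_nonprob + bias_nonprob²) over lam.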

Key words

probability sample/non-probability sample/XGBoost/propensity score weighting/kernel smoothing

Classification

Mathematical sciences

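Cite this article

Luo Shihua, Dai Yufang. Integrated Estimation Method for Probability Samples and Non-Probability Samples[J]. Statistics & Decision (统计与决策), 2026, 42(9): 42-48, 7.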

Funding

Jiangxi Provincial Graduate Innovation Special Fund Project (YC2023-B179)

The 18th Student Research Project of Jiangxi University of Finance and Economics (20231015151904996)

Statistics & Decision (统计与决策)

OA / CHSSCD

ISSN: 1002-6487
