Potential High-Value Passenger Discovery Based on Triple Hybrid Sampling (SSOMaj-SMOTE-SSOMin) and Ensemble Learning [OA] [CSTPCD]
To address the highly imbalanced data and the weak correlation between passenger features and value categories that characterize potential high-value passengers, a potential high-value passenger discovery model based on triple hybrid sampling (SSOMaj-SMOTE-SSOMin) and ensemble learning is proposed. The RFM (Recency, Frequency, Monetary) method is used to label passenger categories; SSOMaj-SMOTE-SSOMin sampling is used to resample the imbalanced passenger data set; a fusion feature selection (FFS) algorithm is used to select passenger features; and a gradient boosting decision tree (GBDT) is used as the classifier to build a passenger value prediction model that identifies potential high-value passengers. Experimental results on the PNR data set show that, compared with the baseline algorithms, the proposed model achieves better AUC and F1 values and identifies potential high-value passengers well.
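The first two steps of the abstract (RFM labeling, then resampling the imbalanced set) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the rank-sum RFM rule, the `top_fraction` cutoff, the function names, and the toy data are hypothetical, and the resampler is a simple stand-in (random undersampling of the majority class followed by SMOTE-style interpolation with a single nearest neighbour), not the SSOMaj-SMOTE-SSOMin method itself. Feature selection and the GBDT classifier are omitted.

```python
import math
import random


def rank_scores(values, reverse=False):
    """Rank-based score in [0, 1] per item.

    With reverse=False the largest value scores 1.0;
    with reverse=True the smallest value scores 1.0.
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=reverse)
    n = len(values)
    scores = [0.0] * n
    for pos, i in enumerate(order):
        scores[i] = pos / (n - 1) if n > 1 else 1.0
    return scores


def rfm_label(recency_days, frequency, monetary, top_fraction=0.2):
    """Label the top `top_fraction` of passengers by summed RFM rank as 1 (assumed rule)."""
    r = rank_scores(recency_days, reverse=True)  # fewer days since last trip is better
    f = rank_scores(frequency)                   # more trips is better
    m = rank_scores(monetary)                    # more spend is better
    total = [a + b + c for a, b, c in zip(r, f, m)]
    n_high = max(1, round(top_fraction * len(total)))
    cutoff = sorted(total, reverse=True)[n_high - 1]
    return [1 if t >= cutoff else 0 for t in total]


def hybrid_resample(X, y, majority_keep=0.5, seed=0):
    """Stand-in hybrid sampler: undersample class 0, then interpolate new class-1 points."""
    rng = random.Random(seed)
    minority = [x for x, lab in zip(X, y) if lab == 1]
    majority = [x for x, lab in zip(X, y) if lab == 0]
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")

    # Step 1: keep a random subset of the majority class.
    n_keep = min(len(majority), max(len(minority), int(majority_keep * len(majority))))
    kept = rng.sample(majority, n_keep)

    # Step 2: SMOTE-style synthesis on the segment between a minority point
    # and its nearest minority neighbour (k = 1 for brevity; features unscaled).
    synthetic = []
    while len(minority) + len(synthetic) < len(kept):
        a = rng.choice(minority)
        b = min((p for p in minority if p is not a), key=lambda p: math.dist(a, p))
        t = rng.random()
        synthetic.append([ai + t * (bi - ai) for ai, bi in zip(a, b)])

    X_res = kept + minority + synthetic
    y_res = [0] * len(kept) + [1] * (len(minority) + len(synthetic))
    return X_res, y_res


# Toy data (hypothetical): ten passengers.
recency_days = [5, 40, 200, 12, 365, 90, 3, 150, 60, 300]
frequency = [12, 3, 1, 9, 1, 2, 15, 2, 4, 1]
monetary = [9000, 1500, 300, 7000, 200, 800, 12000, 600, 2000, 250]

labels = rfm_label(recency_days, frequency, monetary, top_fraction=0.2)
X = [[float(r), float(f), float(m)] for r, f, m in zip(recency_days, frequency, monetary)]
X_res, y_res = hybrid_resample(X, labels, majority_keep=0.5, seed=0)
```

In practice the resampled set would then go through feature selection and a gradient boosting classifier, with AUC and F1 computed on a held-out split. Those stages are left out here because the abstract does not specify them in enough detail to sketch faithfully.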
Feng Xia (冯霞); Hu Fang (胡昉)
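College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300 || Information Technology Research Base of Civil Aviation Administration of China, Tianjin 300300; College of Computer Science and Technology, Civil Aviation University of China, Tianjin 300300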
Computer and Automation
Air transportation; Triple hybrid sampling (SSOMaj-SMOTE-SSOMin); Feature importance ranking; Potential high-value passenger; Imbalanced classification; Ensemble learning
Computer Applications and Software (《计算机应用与软件》), 2024 (001)
pp. 12-17, 35 (7 pages)
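National Natural Science Foundation of China (61502499); Civil Aviation University of China Scientific Research Fund (2013QD18X); Key Laboratory of Intelligent Application Technology for Civil Aviation Passenger Services project.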