Apparel livestreaming sales forecasting models based on machine learning algorithms
Indexing: OA; Peking University Core Journals (北大核心); CSTPCD
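To refine the indicator system for livestreaming sales prediction and improve prediction accuracy, this paper compares several machine learning algorithms, analyzes the factors that influence apparel livestreaming sales, and predicts those sales. First, candidate influencing factors were selected through literature analysis and screened using Spearman's correlation coefficient combined with significance. Second, apparel sales prediction models were built with different machine learning algorithms. R², MAE, RMSE and MAPE were used as evaluation metrics, and 5-fold cross-validation was used to test the performance of each model. The results show that nine factors (the anchor's number of fans, the anchor's average views per session over the last 30 days, the anchor's average per-slot sales (坑产) per session over the last 30 days, product price, duration of product explanation, the product's historical sales over the last 30 days, the brand's number of fans, the brand's historical sales over the last 30 days, and discount) are only weakly collinear with one another and are significantly correlated with livestreaming sales, so they can serve as influencing factors in the prediction model. Among the prediction algorithms, K-nearest neighbor and random forest performed better, both with R² above 0.98 and MAPE within 30.5%. The prediction results can help retailers plan inventory and adjust production plans, and provide data support for product purchasing, pricing and promotion.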
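The screening step (Spearman's correlation combined with significance) can be sketched as follows. This is a minimal, dependency-free illustration and not the authors' code: the significance test here is a permutation test, and the thresholds `min_abs_rho` and `alpha`, the function names, and the toy data are all assumptions introduced for the example. The abstract does not state which test or cut-offs were used.

```python
import random

def ranks(values):
    """Return 1-based ranks; tied values share their mean rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result

def pearson(a, b):
    """Pearson correlation; assumes neither input is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / (var_a * var_b) ** 0.5

def spearman(x, y):
    """Spearman's rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_p_value(x, y, n_perm=2000, seed=0):
    """Two-sided p-value: share of shuffles with |rho| at least as large."""
    rng = random.Random(seed)
    rank_x, rank_y = ranks(x), ranks(y)
    observed = abs(pearson(rank_x, rank_y))
    shuffled = rank_y[:]
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(pearson(rank_x, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def screen_factors(factors, sales, min_abs_rho=0.3, alpha=0.05):
    """Keep factors whose correlation with sales is strong and significant.
    Both thresholds are assumed values, not taken from the paper."""
    kept = {}
    for name, values in factors.items():
        rho = spearman(values, sales)
        p = permutation_p_value(values, sales)
        if abs(rho) >= min_abs_rho and p < alpha:
            kept[name] = (rho, p)
    return kept

# Toy data invented for illustration only (not from the paper).
sales = [12, 30, 25, 60, 80, 75, 120, 150, 140, 200]
factors = {
    "product_price": [99, 89, 95, 79, 69, 72, 59, 49, 52, 39],
    "random_noise": [5, 1, 9, 3, 7, 2, 8, 4, 10, 6],
}
kept = screen_factors(factors, sales)
print(sorted(kept))
```

In practice a statistics library would normally supply both the coefficient and its p-value. The sketch only makes the two-part filter (effect size plus significance) explicit.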
With the dramatic growth of e-commerce livestreaming, the number of e-commerce livestreaming users has reached 48.8% of all Internet users in China. The resulting supply demand requires live e-commerce stores to improve their dispatching efficiency and reduce inventory. To avoid profit losses for retailers, a more accurate method of predicting livestreaming sales is therefore needed. Sales prediction methods mainly comprise traditional statistical methods and machine learning algorithms. Because livestreaming sales are unstable and have many influencing factors, traditional statistical methods often fail to predict them accurately. To complete the index system for livestreaming sales prediction and improve prediction accuracy, this paper adopted a variety of machine learning algorithms (BP neural network, decision tree (DT), random forest (RF), K-nearest neighbor (KNN), and support vector machine (SVM)), analyzed the influencing factors of apparel livestreaming sales, predicted apparel livestreaming sales, and selected the best-performing algorithms. The research process is as follows. Firstly, 17 influencing factors of livestreaming sales were selected through literature review, and the nine most important were retained by using Spearman's correlation coefficient combined with significance. Secondly, different machine learning algorithms were used to establish clothing sales prediction models, and 5-fold cross-validation with R² as the indicator was used to initially screen out three algorithms (RF, KNN and SVM) with high and stable model fit. Finally, the parameters of the three algorithms were optimized and three prediction models were constructed. R², MAE, RMSE and MAPE were used as evaluation indexes, and the optimal algorithms were selected by testing the performance of each model with 5-fold cross-validation.
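The 5-fold cross-validation procedure with R² as the indicator can be illustrated with a small sketch. It uses a hand-written K-nearest neighbor regressor on synthetic one-feature data. The function names, the choice of `k=3`, the random seed, and the data are assumptions for this example only, and the paper's actual models, features and tuned parameters are not reproduced here.

```python
import random

def k_fold_indices(n, n_folds=5, seed=0):
    """Shuffle the row indices and split them into n_folds disjoint folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]

def knn_predict(train_X, train_y, x, k=3):
    """Predict the mean target of the k training rows closest to x."""
    distances = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, x)), target)
        for row, target in zip(train_X, train_y)
    )
    nearest = distances[:k]
    return sum(target for _, target in nearest) / len(nearest)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

def cross_validate_knn(X, y, n_folds=5, k=3, seed=0):
    """Return one R² score per fold; each fold is held out exactly once."""
    scores = []
    for held_out in k_fold_indices(len(X), n_folds, seed):
        held = set(held_out)
        train_X = [X[i] for i in range(len(X)) if i not in held]
        train_y = [y[i] for i in range(len(X)) if i not in held]
        predictions = [knn_predict(train_X, train_y, X[i], k) for i in held_out]
        scores.append(r2_score([y[i] for i in held_out], predictions))
    return scores

# Synthetic data invented for illustration: one feature, linear target.
X = [[float(i)] for i in range(50)]
y = [3.0 * i + 5.0 for i in range(50)]

scores = cross_validate_knn(X, y)
print("per-fold R2:", [round(s, 3) for s in scores])
print("mean R2:", round(sum(scores) / len(scores), 3))
```

Reporting the per-fold scores alongside their mean is what allows a statement about stability across folds, as opposed to average fit alone. With several features on different scales, the features would normally be standardized before computing distances.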
The results show that the multicollinearity among the nine factors (the anchor's number of fans, the anchor's average number of viewers per session in the last 30 days, the anchor's average per-slot sales ("pit output", 坑产) per session in the last 30 days, product price, duration of product explanation, historical sales of the product in the last 30 days, the brand's number of fans, historical sales of the brand in the last 30 days, and discount) is weak and that their correlation with livestreaming sales is significant. These nine factors can therefore be used as influencing factors in the prediction model. Among them, product sales in the last 30 days and the duration of product explanation correlate most strongly with livestreaming sales. Among the prediction algorithms, KNN and RF perform better, both with R² greater than 0.98 and MAPE within 30.5%. The R², RMSE and MAE of the RF algorithm are better than those of the KNN algorithm. However, the MAPE of the KNN algorithm is smaller than that of the RF algorithm, possibly because the KNN algorithm predicts low-sales items more accurately, which yields a smaller relative error. The 5-fold cross-validation results also show that the RF algorithm is more stable than the KNN algorithm, possibly because the KNN algorithm is better suited to datasets whose samples have similar features. RF can therefore be used as the main prediction algorithm in practical applications to keep the prediction of the overall sales trend stable. When predicting sales within the same brand or the same category, where the data are more similar, the KNN algorithm can be considered.
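The contrast between absolute-error metrics (MAE, RMSE) and the relative-error metric (MAPE) can be made concrete with a toy calculation. The numbers below are invented for illustration and are not the paper's data, and `model_a` and `model_b` are hypothetical stand-ins, not the RF and KNN models. They only show how one model can win on RMSE and MAE while another wins on MAPE when it is more accurate on low-sales items.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error; penalizes large absolute misses."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )

def mape(y_true, y_pred):
    """Mean absolute percentage error in percent; undefined if any actual is 0."""
    return 100.0 * sum(
        abs(t - p) / abs(t) for t, p in zip(y_true, y_pred)
    ) / len(y_true)

# Toy example: one low-sales item and one high-sales item.
actual = [10.0, 1000.0]
model_a = [15.0, 1010.0]   # small absolute errors, large relative miss on the low item
model_b = [11.0, 1100.0]   # accurate on the low item, large absolute miss on the high item

for name, predicted in (("A", model_a), ("B", model_b)):
    print(
        name,
        "MAE=%.2f" % mae(actual, predicted),
        "RMSE=%.2f" % rmse(actual, predicted),
        "MAPE=%.1f%%" % mape(actual, predicted),
    )
```

Here model A has the lower MAE and RMSE while model B has the lower MAPE, because MAPE weights each item by its own sales level. This is the same pattern the abstract reports, in which one algorithm is better on RMSE and MAE and the other on MAPE.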
This paper compares the performance of various prediction algorithms on livestreaming sales prediction, optimizes their parameters, and improves the accuracy of livestreaming sales prediction. The prediction results can help retailers plan inventory, adjust production schedules and develop marketing strategies, and can provide data support for product purchasing, pricing and promotion. Because this paper explores only some of the easily quantifiable influencing factors and the sample distribution is limited, future research can expand the scope of sample selection and further improve the predictive indicator system to achieve more accurate predictions.
HAN Bo (韩铂); LI Pei (李沛)
School of Textiles and Fashion, Shanghai University of Engineering Science, Shanghai 201620, China
Light industry
Keywords: live-streaming sales forecasting; machine learning; random forest (RF); K-nearest neighbors (KNN); support vector machine (SVM); 5-fold cross-validation
Journal of Silk (《丝绸》), 2024, No. 7
Pages 109-117 (9 pages)