大数据 (Big Data Research), 2024, Vol. 10, Issue (1): 185-194, 10. DOI: 10.11959/j.issn.2096-0271.2024019
Data expansion method for genetic engineering of special materials with small sample data
Abstract
As the material requirements for underground water conservancy and water pipeline networks grow more diverse and complex, using machine learning to design special materials efficiently and conveniently for individual needs has become a topic of active interest. Traditional supervised learning methods train models on large datasets, but for the special materials required in deeply buried underground water pipeline networks and high-end military equipment, such as rare and high-entropy alloys, obtaining large datasets is extremely costly and time-consuming. To address this problem, we propose a small-sample expansion model, RX-SMOGN, which uses the XGBoost and RFECV algorithms for feature screening and the SMOGN algorithm to enrich the dataset. In this paper, the phase structure of high-entropy alloys is taken as the research object, and traditional machine learning models are trained to predict it in order to verify the effectiveness of the RX-SMOGN model. The results of 5-fold cross-validation and 4 evaluation indicators show that the RX-SMOGN model improves the performance of the machine learning models, provides a more convenient method for alloy material design, and improves the efficiency of alloy material design.
Key words
small sample expansion / feature engineering / machine learning / high-entropy alloy / rare precious metal
Classification
Information technology and security science
Cite this article
杨涛, 张兆波, 郑添屹, 彭保. Data expansion method for genetic engineering of special materials with small sample data [J]. 大数据, 2024, 10(1): 185-194, 10.
Funding
Shenzhen University Stability Support Plan (No.20200829114939001)
Shenzhen Institute of Information Technology School-level Innovative Scientific Research Team Project (No.TD2020E001)
The Pearl River Delta Water Resources Allocation Engineering Scientific Research Project (No.CD88-QT01-2022-0068)
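
The small-sample expansion idea described in the abstract can be illustrated with a minimal sketch. This is not the RX-SMOGN model, and it is not the SMOGN algorithm itself. It only shows, under simplifying assumptions, how a small labelled dataset might be enlarged by adding Gaussian noise scaled to each feature's spread. The function name `gaussian_noise_expand`, the `noise_scale` parameter, the toy feature values, and the labels `phase_A` and `phase_B` are all hypothetical and do not come from the paper.

```python
import random
import statistics


def gaussian_noise_expand(rows, labels, n_new, noise_scale=0.05, seed=0):
    """Return copies of rows/labels extended with n_new noisy synthetic samples.

    Simplified illustration only: each synthetic sample is a randomly chosen
    original sample plus zero-mean Gaussian noise whose standard deviation is
    noise_scale times that feature's population standard deviation.
    """
    rng = random.Random(seed)  # seeded generator for reproducible output
    n_features = len(rows[0])
    # Per-feature spread determines how much noise each feature receives.
    stds = [statistics.pstdev([r[j] for r in rows]) for j in range(n_features)]

    new_rows, new_labels = list(rows), list(labels)
    for _ in range(n_new):
        i = rng.randrange(len(rows))  # pick an original sample to perturb
        noisy = [
            rows[i][j] + rng.gauss(0.0, noise_scale * stds[j])
            for j in range(n_features)
        ]
        new_rows.append(noisy)
        new_labels.append(labels[i])  # the synthetic sample keeps its source label
    return new_rows, new_labels


# Hypothetical toy data: four samples with two numeric features each.
rows = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]]
labels = ["phase_A", "phase_A", "phase_B", "phase_B"]

expanded_rows, expanded_labels = gaussian_noise_expand(rows, labels, n_new=6)
print(len(rows), "->", len(expanded_rows), "samples")
```

In a workflow like the one the abstract outlines, expansion of this kind would presumably be applied only to the training folds of a cross-validation split, so that synthetic samples do not leak into the held-out fold. That ordering is an assumption here, not something the abstract states.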