Journal of the Chinese Ceramic Society (硅酸盐学报), 2026, Vol. 54, Issue 3: 842-856, 15. DOI: 10.14062/j.issn.0454-5648.20250461
A Novel Approach for Ultra-High Performance Concrete Data Augmentation and Intelligent Prediction Driven by Conditional Generative Adversarial Network-Markov Chain Monte Carlo Model
Abstract
Introduction Under the dual pressures of global extreme climates and the demand for long service lives in major engineering projects, conventional concrete materials increasingly struggle to meet the performance requirements of critical infrastructure such as cross-sea bridges and nuclear facilities, highlighting an urgent need for advanced construction materials. Ultra-high performance concrete (UHPC) has emerged as a transformative solution to modern engineering challenges owing to its exceptional compressive strength (>120 MPa), durability, and lightweight design advantages. However, UHPC development faces critical barriers: the intricate nonlinear relationships among its composition, microstructure, and properties undermine conventional modeling approaches, while the scarcity of high-quality datasets (typically <200 samples) with inconsistent experimental protocols severely limits the reliability of data-driven models. This "algorithm-data mismatch" dilemma necessitates innovative database enhancement strategies.
Existing efforts in transfer learning and data augmentation, including generative adversarial networks (GANs) and Kriging interpolation, show promise but encounter three fundamental limitations: physical inconsistency in synthetic data (e.g., illogical porosity-strength correlations), high-dimensionality failures in interpolation methods, and domain-shift distortions during cross-material knowledge transfer. To address these challenges, this study proposes a dual-driven data enhancement framework integrating conditional GAN (CGAN) and Markov chain Monte Carlo (MCMC) techniques. The methodology systematically resolves dimensionality issues through preprocessing, generates physically constrained virtual samples via CGAN under constitutive material relationships, and ensures topological consistency using MCMC statistical correction. A Bayesian-optimized XGBoost model is used to validate the enhanced dataset, with experimental verification from marine floating platform UHPC applications demonstrating the framework's efficacy in overcoming multi-dimensional data generation constraints. This approach aims to establish a robust paradigm for intelligent performance prediction under small-sample conditions, advancing digital design capabilities for concrete structures in complex marine environments.
Methods This study developed a CGAN-MCMC model to address small-database augmentation and intelligent prediction for UHPC. The process commenced with standardized preprocessing of the original UHPC database, in which missing values and outliers were handled by the multiple imputation by chained equations (MICE) method and the isolation forest (IF) algorithm, respectively, followed by K-fold cross-validation partitioning. The CGAN-MCMC model was then employed to enhance the UHPC database, and distribution comparisons and Kolmogorov-Smirnov (KS) tests were conducted to verify the homogeneity between the original and augmented databases. Subsequently, a Bayesian optimization (BO)-tuned XGBoost model was implemented to evaluate performance prediction capabilities on both the pre- and post-augmentation datasets. The model validity was further confirmed through validation against an actual UHPC engineering dataset from a floating platform project in Guangdong Province, China, demonstrating its practical applicability and robustness in engineering scenarios.
Results and discussion Data standardization effectively eliminates feature scale discrepancies, mitigating overfitting risks caused by parameter weight bias. The MICE method significantly enhances model noise resistance and prediction stability, establishing a high-reliability data foundation for UHPC performance prediction models. The IF algorithm effectively removes outliers induced by instrument drift or mixing errors, improving dataset robustness. K-fold cross-validation further strengthens model generalization by comprehensively evaluating performance across diverse data subsets. The optimized dataset demonstrates marked improvements in prediction accuracy and consistency across all performance metrics.
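The preprocessing and homogeneity check described above can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes scikit-learn's `IterativeImputer` as a MICE-style stand-in (a single chained-equation imputation rather than true multiple imputation), a hypothetical contamination rate of 0.05 for the isolation forest, five folds, and purely synthetic placeholder data. The "augmented" array is a simple bootstrap resample used only to exercise the KS test; it is not CGAN-MCMC output.

```python
import numpy as np
from scipy.stats import ks_2samp
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import IsolationForest
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler


def preprocess(X, contamination=0.05, seed=0):
    """Impute missing values, drop rows flagged as outliers, then standardize.

    The contamination rate is an assumed value, not one taken from the paper.
    """
    # MICE-style chained-equation imputation (single imputation here).
    imputed = IterativeImputer(random_state=seed).fit_transform(X)
    # IsolationForest.fit_predict returns 1 for inliers and -1 for outliers.
    forest = IsolationForest(contamination=contamination, random_state=seed)
    keep = forest.fit_predict(imputed) == 1
    # Standardize each feature to zero mean and unit variance.
    scaled = StandardScaler().fit_transform(imputed[keep])
    return scaled, keep


def ks_pvalues(original, augmented):
    """Two-sample KS p-value for each feature column."""
    return np.array([
        ks_2samp(original[:, j], augmented[:, j]).pvalue
        for j in range(original.shape[1])
    ])


# Synthetic placeholder data: 145 samples, 4 features, about 5% missing entries.
rng = np.random.default_rng(0)
X = rng.normal(size=(145, 4))
X[rng.random(X.shape) < 0.05] = np.nan

X_clean, keep = preprocess(X)

# K-fold partitioning of the cleaned data (5 folds is an assumed choice).
folds = list(KFold(n_splits=5, shuffle=True, random_state=0).split(X_clean))

# Stand-in for an augmented dataset: bootstrap resample of the cleaned rows.
X_aug = X_clean[rng.integers(0, len(X_clean), size=1500)]
p_values = ks_pvalues(X_clean, X_aug)
print("rows kept:", X_clean.shape[0], "folds:", len(folds))
print("KS p-values per feature:", np.round(p_values, 3))
```

In this sketch, a per-feature p-value above 0.05 would mean the KS test finds no significant difference between the original and augmented distributions for that feature. In a real pipeline the scaler and imputer would normally be fitted inside each training fold to avoid leakage into the held-out fold.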
The CGAN-MCMC model is applied to augment the preprocessed UHPC database (N=145), generating an enhanced dataset (N=1,500). Statistical homogeneity between the original and augmented data is confirmed through comparative analysis of distribution patterns, Pearson correlation matrices, and Kolmogorov-Smirnov (KS) tests (p-values >0.05). With the augmented data, a Bayesian optimization (BO)-tuned XGBoost model shows substantial improvements in test-set R² values for 3-day hydration heat (62%), median pore size (55%), compressive strength (18%), flexural strength (46%), and workability (35%), quantitatively validating the CGAN-MCMC enhancement methodology. Validation with engineering data from a floating platform project in Guangdong Province, China, demonstrates enhanced model performance: the test-set R² increases from 0.65 to 0.92, the mean absolute error (MAE) decreases by 35% (training) and 50% (testing), and the root mean squared error (RMSE) decreases by 39% (training) and 48% (testing). These results empirically confirm the engineering feasibility of the proposed framework. The proposed methodology advances UHPC intelligent design through improved data quality and model generalization, providing an innovative solution for performance optimization.
Conclusions This study demonstrates the potential of machine learning to overcome data limitations and enhance UHPC performance prediction. It establishes an approach to UHPC design optimization that integrates advanced data augmentation with optimized ML modeling. The CGAN-MCMC framework can resolve critical challenges in material informatics and provide actionable insights for developing sustainable, high-performance concrete systems, accelerating the digital evolution of construction materials engineering.
Keywords
artificial intelligence / ultra-high performance concrete / data augmentation / machine learning / intelligent prediction
Classification
Information technology and security science
Cite this article
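Yu Rui, Li Wang, Fan Dingqiang. A Novel Approach for Ultra-High Performance Concrete Data Augmentation and Intelligent Prediction Driven by Conditional Generative Adversarial Network-Markov Chain Monte Carlo Model [J]. Journal of the Chinese Ceramic Society, 2026, 54(3): 842-856, 15.
Funding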
National Natural Science Foundation of China (52178249)
Guangdong Natural Science Foundation Youth Enhancement Project (2024A1515030243)
Wuhan Knowledge Innovation Special Project (2023010201010094)