Application Research of Computers, 2026, Vol. 43, Issue 4: 1171-1179 (9 pages). DOI: 10.19734/j.issn.1001-3695.2025.07.0292
Database cardinality estimation method based on adaptive Gaussian mixture model
Abstract
Cardinality estimation is a critical component of database query optimization, and its accuracy directly affects the execution efficiency of query plans. Cardinality estimators based on deep autoregressive models have demonstrated remarkable accuracy in prior studies. However, they struggle to capture data distribution patterns when handling continuous attributes with large domains, which leads to significant performance degradation. To address this problem, this paper proposes AGCard, a novel cardinality estimator based on an adaptive Gaussian mixture model. AGCard first dynamically adjusts the number and parameters of the Gaussian components to fit the data distribution of continuous attributes adaptively, thereby reducing the domain size. It then applies a bias correction algorithm to compensate for the estimation deviations introduced by progressive sampling while avoiding additional computational overhead. Extensive experiments on three real-world datasets (including WISDM) demonstrate that the proposed method outperforms existing mainstream baselines in estimation accuracy, inference latency, and storage overhead. The results confirm the effectiveness of both the adaptive Gaussian mixture model and the bias correction algorithm.
Keywords
query optimization/cardinality estimation/adaptive Gaussian mixture model/autoregressive model/bias correction
Classification
Information technology and security science
Cite this article
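Li Hao (李昊), Liu Mengchi (刘梦赤), Zou Ruiji (邹瑞基), Liu Mingkai (刘明凯). Database cardinality estimation method based on adaptive Gaussian mixture model [J]. Application Research of Computers, 2026, 43(4): 1171-1179.
Funding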
National Natural Science Foundation of China (61672389)
Guangzhou Key Laboratory of Big Data and Intelligent Education (201905010009)