Abstract
With the rapid development of cloud computing and network technologies, cloud-network environments are becoming increasingly complex, and traditional monitoring methods struggle to deliver efficient and accurate observability. To address this, a large-model-driven agent monitoring scheme is proposed that aims to achieve non-intrusive collection, intelligent analysis, fusion optimization, and adaptive scheduling of full-stack data. The scheme collects and analyzes application, network, system, and hardware data in real time by integrating self-supervised learning, contrastive learning, Transformer-based deep learning models, graph neural networks (GNNs), and the extended Berkeley Packet Filter (eBPF). By further incorporating non-intrusive optimization of GPU computing performance and multi-source data fusion, the scheme improves system stability, performance monitoring, and operations and maintenance efficiency. It offers advantages in cloud computing platforms, AI training, large-scale network monitoring, and intelligent operations and maintenance of data centers, and provides a new approach to building an efficient, intelligent cloud-network monitoring system.
Keywords
cloud-network monitoring/agent/large models/full-stack observability/non-intrusive data collection/GPU performance optimization
Classification
Information Technology and Security Science