情报杂志 (Journal of Intelligence), 2025, Vol. 44, Issue 5: 91-103, 13. DOI: 10.3969/j.issn.1002-1965.2025.05.012
Multi-Task Analysis of Online Public Opinion Based on Hybrid Retrieval-Augmented Generation of Large Language Model
Abstract
[Research purpose] Large language models applied to online public opinion analysis are prone to "hallucination", producing output that is incorrect or inconsistent with reality, and fine-tuning them usually consumes substantial computing resources. To address both problems, this study proposes a hybrid retrieval-augmented generation (RAG) framework that enhances the natural language understanding and generation capabilities of large language models and enables cross-domain, cross-task analysis of online public opinion. [Research method] First, a local RAG knowledge base is built by integrating high-quality public opinion event datasets, so that the framework can adapt to different domains without large-scale parameter fine-tuning, and the retrieved knowledge is introduced into the generation process to alleviate hallucination. In addition, contextual information and prompt examples are introduced to enrich the model's knowledge and improve its accuracy on downstream tasks. Finally, a prompt for the public opinion analysis task is constructed by combining the retrieved content, the contextual information, and the prompt examples, which facilitates the transfer of knowledge from the source domain to the target domain. [Research result/conclusion] The experimental results show that the hybrid RAG framework achieves excellent performance on a range of cross-domain and cross-language datasets covering tasks such as misinformation detection, automated report generation, intelligent question answering, information extraction, and sentiment analysis, and that it can adapt to few-shot learning scenarios.
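The prompt construction step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not specify the retriever, the prompt template, or any data, so the Jaccard-overlap retriever, the `Example` class, the `retrieve` and `build_prompt` functions, and the toy knowledge base below are all hypothetical stand-ins chosen only to show how retrieved content, contextual information, and prompt examples might be combined into one prompt.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class Example:
    """A hypothetical few-shot prompt example (input text plus expected output)."""
    text: str
    label: str


def tokenize(text: str) -> set[str]:
    # Lowercased whitespace tokens; a real system would use a proper tokenizer.
    return set(text.lower().split())


def retrieve(query: str, knowledge_base: list[str], top_k: int = 2) -> list[str]:
    # Stand-in retriever (an assumption): rank passages by Jaccard overlap
    # between the query tokens and the passage tokens.
    q = tokenize(query)

    def score(passage: str) -> float:
        p = tokenize(passage)
        union = q | p
        return len(q & p) / len(union) if union else 0.0

    return sorted(knowledge_base, key=score, reverse=True)[:top_k]


def build_prompt(task: str, query: str, retrieved: list[str],
                 context: str, examples: list[Example]) -> str:
    # Combine the three knowledge sources named in the abstract:
    # retrieved content, contextual information, and prompt examples.
    lines = [f"Task: {task}", "", "Retrieved knowledge:"]
    lines += [f"- {passage}" for passage in retrieved]
    lines += ["", f"Context: {context}", "", "Examples:"]
    lines += [f"Input: {e.text}\nOutput: {e.label}" for e in examples]
    lines += ["", f"Input: {query}", "Output:"]
    return "\n".join(lines)


# Toy data, invented purely for illustration.
kb = [
    "The city government confirmed the flood warning on Monday",
    "A celebrity denied rumors about a product recall",
    "Officials said the flood warning was lifted after two days",
]
query = "Is the flood warning still active"
retrieved = retrieve(query, kb)
prompt = build_prompt(
    task="misinformation detection",
    query=query,
    retrieved=retrieved,
    context="Posts collected from a local news forum",
    examples=[Example("The dam has collapsed", "unverified")],
)
print(prompt)
```

The resulting string would be sent to a large language model unchanged, which is why no parameter fine-tuning is involved in this sketch; adapting to a new domain only means swapping the knowledge base and the examples.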
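Keywords
online public opinion / public opinion analysis / large language model / retrieval-augmented generation (RAG) / prompt learning
Classification
Social sciences
Cite this article
Wang Runzhou (王润周), Zhang Xinsheng (张新生), Wang Minghu (王明虎), Su Jia (苏佳), Ma Yulong (马玉龙). Multi-Task Analysis of Online Public Opinion Based on Hybrid Retrieval-Augmented Generation of Large Language Model [J]. 情报杂志 (Journal of Intelligence), 2025, 44(5): 91-103, 13.
Funding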
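This article is a research output of the Shaanxi Provincial Social Science Fund project "Research on Risk Perception and Governance Paths of Generative False Information in the AIGC Era" (No. 2024R083) and the Shaanxi Provincial Key Industry Innovation Chain (Cluster) industrial-field project "Research, Development, and Application Demonstration of Key Technologies for the Dissemination and Inheritance of Intangible Cultural Heritage in the Context of New Infrastructure" (No. 2022ZDLGY06-04).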