An Open Source Intelligence Analysis Method Based on Generative LLM

CHENG Leifeng, LUO Ji, WANG Lei, ZHU Min, TAO Sitong

Telecommunication Engineering (电讯技术), 2025, Vol. 65, Issue 10: 1545-1550, 6. DOI: 10.20079/j.issn.1001-893x.250204001

CHENG Leifeng (1), LUO Ji (2), WANG Lei (3), ZHU Min (4), TAO Sitong (2)

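Author information

  • 1. College of Computer Science, Sichuan University, Chengdu 610065; Southwest China Institute of Electronic Technology, Chengdu 610036
  • 2. China Telecom Digital Intelligence Technology Co., Ltd., Beijing 100001
  • 3. Southwest China Institute of Electronic Technology, Chengdu 610036
  • 4. College of Computer Science, Sichuan University, Chengdu 610065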

Abstract

The authors propose a method that integrates generative large language models (LLMs), XPath, and retrieval-augmented generation (RAG) for web page information extraction in open-source intelligence analysis. Its key innovations are a dynamic templated prompting strategy and multi-granularity semantic retrieval. The dynamic templates generate domain-constrained prompts according to the intelligence type (events, persons, or organizations), which improves entity extraction accuracy. The multi-granularity retrieval builds a document-paragraph-entity hierarchy, optimized by the BERT-Topk algorithm, to handle fragmented information in long texts. By aligning entities with OpenKG, the method constructs a three-dimensional attribute-relation-event network that strengthens complex event analysis. Experiments on the ClueWeb22 and TAC-KBP2022 datasets show an extraction rate of 0.85 and a response accuracy of 0.78, outperforming traditional RAG by 18% to 31%. In practical applications, the method achieves 92% key-fact accuracy in event briefings at a total cost of only 12% of that of GPT-4.
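The two ideas named in the abstract, type-dependent prompt templates and top-k retrieval over a document-paragraph hierarchy, can be illustrated with a minimal sketch. This is not the authors' implementation: the template wording, the function names (`build_prompt`, `top_k_paragraphs`), and the bag-of-words cosine similarity used in place of BERT embeddings are all assumptions made for illustration, and the entity level of the hierarchy, XPath extraction, and OpenKG alignment are left out.

```python
import math
import re
from collections import Counter

# Hypothetical templates, one per intelligence type named in the abstract.
TEMPLATES = {
    "event": "Extract the event name, time, place and participants from:\n{context}",
    "person": "Extract the person's name, role and affiliations from:\n{context}",
    "organization": "Extract the organization's name, type and members from:\n{context}",
}


def build_prompt(intel_type: str, context: str) -> str:
    """Pick the template for the intelligence type and fill in the context."""
    if intel_type not in TEMPLATES:
        raise ValueError(f"unknown intelligence type: {intel_type}")
    return TEMPLATES[intel_type].format(context=context)


def embed(text: str) -> Counter:
    """Assumed stand-in for a BERT encoder: a bag-of-words term-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k_paragraphs(query, documents, k=2):
    """Score every paragraph of every document; return the k best as (doc_id, paragraph)."""
    q = embed(query)
    scored = [
        (cosine(q, embed(paragraph)), doc_id, paragraph)
        for doc_id, paragraphs in documents.items()
        for paragraph in paragraphs
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [(doc_id, paragraph) for _, doc_id, paragraph in scored[:k]]


# Usage: retrieve the best-matching paragraph, then wrap it in an event prompt.
docs = {
    "d1": ["The summit took place in Geneva on Monday.", "Weather was mild."],
    "d2": ["The company reported quarterly earnings."],
}
hits = top_k_paragraphs("Where did the summit take place?", docs, k=1)
prompt = build_prompt("event", hits[0][1])
```

Keeping the templates in a plain dictionary keyed by intelligence type means a new type only needs a new entry. Swapping `embed` for a real sentence encoder would leave the ranking logic unchanged.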

Key words

open source intelligence analysis/web information extraction/generative large language model/retrieval-augmented generation

Classification

Information technology and security science

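Cite this article

CHENG Leifeng, LUO Ji, WANG Lei, ZHU Min, TAO Sitong. An Open Source Intelligence Analysis Method Based on Generative LLM[J]. Telecommunication Engineering (电讯技术), 2025, 65(10): 1545-1550, 6.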

Telecommunication Engineering (电讯技术)

Open access; Peking University Core Journal

ISSN 1001-893X
