Journal of Nanyang Normal University (南阳师范学院学报), 2024, Vol. 23, Issue 3: 60-65, 6.
Research on a Web entity extraction algorithm based on graph matching
Abstract
Web data is often missing, inconsistent, or imprecise, and traditional search engines can only return keyword-matched document fragments rather than the target entities themselves. This paper proposes GMEE (Graph Matching Based Entity Extraction), a new graph-based entity extraction algorithm. First, the fragments are segmented into words to filter candidate entities. Next, a "weighted semantic entity association graph" is built from the structural and semantic relationships between the entities. Finally, the target entities are extracted using a "maximum common subgraph matching" strategy. The experimental results show that the proposed algorithm effectively reduces the extracted entity set without large-scale parameter training and transmission, which preserves recall and accuracy while improving the interpretability of the extraction process.
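The three steps named in the abstract can be illustrated with a minimal sketch. It is not the paper's GMEE implementation: it assumes word segmentation and entity filtering have already produced a list of entity names per fragment, it uses a plain co-occurrence count as the edge weight (the paper's weighting scheme is not given here), and it relies on every node carrying a unique entity name, in which case the maximum common edge subgraph of two graphs is simply their shared edges. The general maximum common subgraph problem is NP-hard and needs a real matching algorithm. All function names below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def build_association_graph(fragments):
    """Build a weighted entity graph from per-fragment entity lists.

    Assumption: the edge weight is the number of fragments in which
    both entities co-occur. Edges are stored as sorted name pairs.
    """
    graph = defaultdict(int)
    for entities in fragments:
        for a, b in combinations(sorted(set(entities)), 2):
            graph[(a, b)] += 1
    return dict(graph)


def common_subgraph(g1, g2):
    """Return the edges present in both graphs, keeping the smaller weight.

    With uniquely labelled nodes this is the maximum common edge subgraph.
    """
    return {edge: min(w, g2[edge]) for edge, w in g1.items() if edge in g2}


def rank_entities(graph):
    """Rank entities by the total weight of their incident edges."""
    scores = defaultdict(int)
    for (a, b), w in graph.items():
        scores[a] += w
        scores[b] += w
    # Ties are broken alphabetically so the ordering is deterministic.
    return sorted(scores, key=lambda e: (-scores[e], e))
```

Calling `build_association_graph` once on the entities of a query and once on the entities of the returned fragments, then passing both graphs to `common_subgraph` and the result to `rank_entities`, yields a reduced, ordered candidate set. Every step is an explicit graph operation that can be inspected, which is the kind of interpretability the abstract refers to.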
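Key words
graph matching / entity extraction / Web / search engine
Classification
Information Technology and Security Science
Cite this article
XU Yao (徐曜). Research on a Web entity extraction algorithm based on graph matching [J]. Journal of Nanyang Normal University (南阳师范学院学报), 2024, 23(3): 60-65, 6.
Funding
Anhui Province Higher Education Quality Engineering Project (安徽省高等学校质量工程项目) (2021FWXXCY034)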