Application Research of Computers (计算机应用研究), 2024, Vol. 41, Issue 11: 3288-3294, 7. DOI: 10.19734/j.issn.1001-3695.2024.03.0083
Distant supervision relation extraction based on multi-level attention mechanism and dynamic threshold
(Original Chinese title: 基于多层级注意力机制和动态阈值的远程监督关系抽取)
Abstract
Distant supervision relation extraction suffers from a data quality problem: the generated dataset contains several types of noise, namely noisy words, noisy sentences, and noisy bags. Existing research mainly targets noisy sentences, ignores the impact of the other noise types, and cannot eliminate noise completely. To this end, the paper proposes a distant supervision relation extraction model based on a multi-level attention mechanism and dynamic thresholding (MADT). The model first uses a pre-trained language model to obtain entity-pair semantic representations, then obtains semantic features embedded with keyword information through a bidirectional gated recurrent unit and a self-attention mechanism, and then handles the three noise problems in sequence in conjunction with the deep contextual representation of the sentence. In addition, the paper proposes a dynamic thresholding method that further removes noisy sentences and enhances the contribution of positive example sentences to the bag representation, and it reduces the impact of noisy bags using a semantic similarity-based attention mechanism. Experiments on the NYT10d and NYT10m datasets show that the MADT model addresses noise at all levels in distant supervision relation extraction and effectively improves relation extraction performance.
Keywords
distant supervision relation extraction / self-attention mechanism / dynamic threshold / pre-trained language model / noise reduction
Classification
Information technology and security science
Cite this article
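ZHAO Hongyan (赵红燕), ZHANG Yinggang (张莹刚), XIE Binhong (谢斌红). Distant supervision relation extraction based on multi-level attention mechanism and dynamic threshold [J]. Application Research of Computers, 2024, 41(11): 3288-3294, 7.
Funding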
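Shanxi Province Basic Research Program (202203021211199)
Open Fund of the Shanxi Provincial Key Laboratory of Intelligent Information Processing (CICIP2022004)
Doctoral Research Start-up Fund of Taiyuan University of Science and Technology (20212075)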
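
The abstract mentions removing noisy sentences with a dynamic threshold before building the bag representation, but it does not state the threshold rule. The following is only a minimal sketch of that idea under assumptions that are not taken from the paper: sentence weights come from a dot-product attention against a hypothetical `relation_query` vector, and the threshold is the mean attention weight of the bag (`1 / n` for `n` sentences), so it adapts to bag size. The function names `softmax`, `dot`, and `bag_representation` are illustrative.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def bag_representation(sentence_vecs, relation_query):
    """Aggregate the sentence vectors of one bag into a single vector.

    Assumption (not from the paper): the dynamic threshold is the mean
    attention weight of the bag, which equals 1 / len(bag).
    """
    # Sentence-level attention: score each sentence against the query.
    weights = softmax([dot(vec, relation_query) for vec in sentence_vecs])

    # Dynamic threshold: keep sentences at or above the bag's mean weight.
    # The highest-weighted sentence always passes, so `kept` is never empty.
    threshold = 1.0 / len(weights)
    kept = [i for i, w in enumerate(weights) if w >= threshold]

    # Renormalise over the kept sentences so their contribution increases.
    kept_total = sum(weights[i] for i in kept)
    dim = len(sentence_vecs[0])
    bag_vec = [0.0] * dim
    for i in kept:
        share = weights[i] / kept_total
        for d in range(dim):
            bag_vec[d] += share * sentence_vecs[i][d]
    return bag_vec, kept
```

For a bag such as `[[1.0, 0.0], [0.9, 0.1], [0.0, 1.0]]` with query `[2.0, 0.0]`, the third sentence falls below the threshold and is dropped, and the bag vector is built from the first two only. A threshold tied to the bag's own weight distribution, rather than a fixed constant, is one plausible reading of "dynamic"; the paper's actual rule may differ.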