Satellite domain corpus construction and named entity recognition

徐聪 1, 石会鹏 2, 陈志敏 3, 张鑫宇 1, 王静 3, 杨甲森 3

Journal of National University of Defense Technology (国防科技大学学报), 2024, Vol. 46, Issue (4): 175-183, 9. DOI: 10.11887/j.cn.202404019

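Author information

  • 1. Key Laboratory of Electronics and Information Technology for Space Systems, National Space Science Center, Chinese Academy of Sciences, Beijing 100190; University of Chinese Academy of Sciences, Beijing 100049
  • 2. State Radio Monitoring Center Testing Center, Beijing 100041
  • 3. Key Laboratory of Electronics and Information Technology for Space Systems, National Space Science Center, Chinese Academy of Sciences, Beijing 100190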

Abstract

To address the lack of a named entity corpus in the satellite domain and the low recognition performance of existing algorithms, a satellite domain entity labeling method that accounts for fuzzy boundaries was proposed, and a corpus covering 8 common satellite domain entity types was constructed, with finer granularity and wider coverage than existing corpora in this field. On this basis, a satellite domain entity recognition algorithm combining transfer learning and multi-network fusion was proposed. The algorithm uses pretrained BERT (bidirectional encoder representations from transformers) to transfer pretrained semantics to the corpus and obtain subword-level features, a BiLSTM (bi-directional long short-term memory) network to capture contextual information and determine entity boundaries, and a conditional random field as the decoder for label prediction. Experimental results show that, compared with traditional models such as BiLSTM, the proposed algorithm achieves better recognition performance: the F1-score for each of the 8 entity types is above 92%, and the micro-average F1-score reaches 96.10%.
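The abstract reports both per-entity F1-scores and a micro-average F1-score. As a minimal sketch of how these two figures differ, the Python below scores predictions at the entity-span level from BIO tag sequences. The BIO scheme, the label names SAT and ORB, and the function names are assumptions made for illustration; the abstract does not state the paper's tagging scheme, label set, or evaluation code.

```python
from collections import Counter


def extract_spans(tags):
    """Return the set of (type, start, end) entity spans in a BIO tag sequence."""
    spans = set()
    start, etype = None, None
    # A trailing "O" sentinel flushes a span that runs to the end of the sequence.
    for i, tag in enumerate(list(tags) + ["O"]):
        inside = tag.startswith("I-") and etype == tag[2:]
        if not inside:
            if etype is not None:
                spans.add((etype, start, i))
            if tag.startswith("B-") or tag.startswith("I-"):
                # Lenient choice: a stray I- tag opens a new span.
                start, etype = i, tag[2:]
            else:
                start, etype = None, None
    return spans


def f1(tp, fp, fn):
    """F1 from raw counts, defined as 0.0 when precision or recall is undefined."""
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0


def evaluate(gold_seqs, pred_seqs):
    """Return (per-type F1 dict, micro-average F1) over exact span matches."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for gold, pred in zip(gold_seqs, pred_seqs):
        g, p = extract_spans(gold), extract_spans(pred)
        for etype, _, _ in g & p:
            tp[etype] += 1
        for etype, _, _ in p - g:
            fp[etype] += 1
        for etype, _, _ in g - p:
            fn[etype] += 1
    types = set(tp) | set(fp) | set(fn)
    per_type = {t: f1(tp[t], fp[t], fn[t]) for t in types}
    # Micro-averaging pools the counts of all types before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return per_type, micro


# Hypothetical labels: SAT (satellite name), ORB (orbit type).
gold = [["B-SAT", "I-SAT", "O", "B-ORB"], ["B-SAT", "O"]]
pred = [["B-SAT", "I-SAT", "O", "O"], ["B-SAT", "O"]]
per_type, micro = evaluate(gold, pred)
print(f"{micro:.4f}")  # prints 0.8000
```

In this toy input both SAT spans are found and the single ORB span is missed, so the per-type scores are 1.0 and 0.0 while the micro-average is 0.8. Because micro-averaging weights every entity equally, frequent entity types dominate the pooled score, which is why a per-type breakdown is usually reported alongside it.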

Key words

named entity recognition/transfer learning/neural networks/data scarcity

Classification

Aerospace

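Cite this article

徐聪, 石会鹏, 陈志敏, 张鑫宇, 王静, 杨甲森. Satellite domain corpus construction and named entity recognition[J]. Journal of National University of Defense Technology, 2024, 46(4): 175-183, 9.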

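Funding

Supported by the Merit-based Fund of the Key Laboratory of Electronics and Information Technology for Space Systems, Chinese Academy of Sciences (Y42613A32S)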

Journal of National University of Defense Technology (国防科技大学学报)

OA / Peking University Core Journal / CSTPCD

1001-2486
