Machine reading comprehension based on bi-directional attention flow combined with self-attention

顾健伟 曾诚 邹恩岑 陈扬 沈艺 陆悠 奚雪峰

Journal of Nanjing University (Natural Sciences) [南京大学学报(自然科学版)], 2019, Vol. 55, Issue 1: 125-132 (8 pages). DOI: 10.13232/j.cnkij.nju.2019.01.013

Research on machine reading comprehension task based on BiDAF with self-attention

顾健伟 (1), 曾诚 (2), 邹恩岑 (3), 陈扬 (1), 沈艺 (1), 陆悠 (2), 奚雪峰 (1)

Author information

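  • 1. School of Electronic and Information Engineering, Suzhou University of Science and Technology, Suzhou 215009
  • 2. Suzhou Key Laboratory of Virtual Reality Intelligent Interaction and Application Technology, Suzhou 215009
  • 3. Command Center, Kunshan Public Security Bureau, Suzhou 215300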

Abstract

Machine Reading Comprehension (MRC) has long been a research hotspot and a core problem in Natural Language Processing (NLP). Bringing machine understanding close to human understanding will remain a continuing research goal until the arrival of the intelligent era. Recently, Baidu released DuReader, a large open-source Chinese reading comprehension dataset that aims to handle real-life Reading Comprehension (RC) problems. This large-scale question-and-answer (QA) dataset is more practical and more difficult than earlier datasets. Not long ago, the attention mechanism was successfully extended to NLP. Typically, these methods use attention to focus on a small portion of the context and summarize it with a fixed-size vector, couple attentions temporally, and often form a uni-directional attention. In view of the excellent results of the attention mechanism in NLP, this paper studies and uses a Bi-Directional Attention Flow (BiDAF) network with self-attention to deal with the MRC task. With this model, a query-aware context representation can be obtained, and representations at different levels of granularity can be distinguished. We also use a self-attention mechanism to capture word dependencies and syntactic information within the sentences of the text and the questions. This step reduces the semantic loss of sentences during information aggregation. We then aggregate semantic information with a bi-LSTM (bidirectional Long Short-Term Memory) network to obtain the information matrix that is used to predict the final answer. After training, the model reaches a percentage of identical words (BLEU-4) of 44.7% and a percentage of overlapping units (Rouge-L) of 49.1%, whereas the average human level is 55.1% and 54.4%, respectively. A gap remains between the experimental results and the human level, but it is not very large, indicating that the method is effective and scalable.
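The attention steps named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the similarity function, the layer sizes, or the exact form of self-attention, so the sketch assumes a plain dot-product similarity (the original BiDAF formulation uses a trainable similarity function instead) and an unparameterized scaled dot-product self-attention. All function names are hypothetical, and the bi-LSTM aggregation and answer prediction stages are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def weighted_sum(weights, vectors):
    """Sum of vectors, each scaled by its weight."""
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors))
            for d in range(dim)]


def bidirectional_attention(context, query):
    """Return one query-aware vector per context word.

    Assumption: similarity is a plain dot product (a simplification).
    """
    # sim[t][j]: similarity between context word t and query word j.
    sim = [[dot(c, q) for q in query] for c in context]
    # Context-to-query: each context word attends over all query words.
    c2q = [weighted_sum(softmax(row), query) for row in sim]
    # Query-to-context: context words weighted by their best query match.
    q2c = weighted_sum(softmax([max(row) for row in sim]), context)
    # Concatenate [c; c2q; c * c2q; c * q2c] for every context word.
    merged = []
    for c, a in zip(context, c2q):
        merged.append(list(c) + a
                      + [x * y for x, y in zip(c, a)]
                      + [x * y for x, y in zip(c, q2c)])
    return merged


def self_attention(sequence):
    """Scaled dot-product self-attention without learned projections."""
    scale = math.sqrt(len(sequence[0]))
    return [weighted_sum(softmax([dot(x, y) / scale for y in sequence]),
                         sequence)
            for x in sequence]


if __name__ == "__main__":
    # Toy 2-dimensional "embeddings": 3 context words, 2 query words.
    context = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    query = [[1.0, 0.0], [0.0, 2.0]]
    aware = bidirectional_attention(self_attention(context),
                                    self_attention(query))
    print(len(aware), len(aware[0]))
```

In this sketch each context word ends up with a vector four times the embedding size, which is the kind of query-aware representation that a later sequence model such as a bi-LSTM could aggregate.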

Key words

Chinese machine reading comprehension (MRC)/DuReader dataset/BiDAF model/self-attention mechanism

Classification

Information technology and security science

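Cite this article

顾健伟, 曾诚, 邹恩岑, 陈扬, 沈艺, 陆悠, 奚雪峰. Research on machine reading comprehension task based on BiDAF with self-attention [J]. Journal of Nanjing University (Natural Sciences), 2019, 55(1): 125-132.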

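Funding

National Natural Science Foundation of China (61673290, 61728205, 61750110534); Postgraduate Practice Innovation Program of Jiangsu Province (SJCX17_0681); Suzhou Science and Technology Development Program, Industrial Foresight Project (SYG201707, SYG201817)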

Journal of Nanjing University (Natural Sciences)

OA, CSCD, CSTPCD

ISSN 0469-5097
