Research on Keyword-Based Automatic Summarization of Single Uyghur Documents

Computer Engineering and Applications, 2015, (16): 130-135, 6. DOI: 10.3778/j.issn.1002-8331.1308-0372

买哈铺热提·外力¹, 赵梦原², 艾斯卡尔·艾木都拉¹

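Author information

1. School of Information Science and Engineering, Xinjiang University, Urumqi 830046
2. Center for Speech and Language Technologies, Tsinghua University, Beijing 100086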

Abstract

The development of information technology, as represented by the Internet, has made information easier to obtain than ever before, but it also poses challenges for using that information effectively. Automatic summarization techniques greatly improve the efficiency of information use by automatically selecting representative sentences from a document. In recent years, automatic summarization for English and Chinese has received wide attention and made significant progress, whereas research on minority languages such as Uyghur remains insufficient. This paper constructs an automatic summarization system for Uyghur. Uyghur linguistic knowledge is used to preprocess the document, and keywords extracted from the document are then used to produce the summary. Two keyword extraction algorithms, one based on TF-IDF and one based on TextRank, are compared; the results show that the TextRank method is more suitable for automatic text summarization. The experiments demonstrate that, when Uyghur linguistic information is fully taken into account, keyword-based automatic summarization can achieve satisfactory results.
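As a rough illustration of the keyword-based approach the abstract describes, the following minimal sketch ranks words with a TextRank-style iteration over a co-occurrence graph and then selects the sentences that contain the most keywords. It is not the authors' implementation: the function names `textrank_keywords` and `summarize`, the window size, the damping factor, and the sentence scoring rule are all assumptions made for illustration, and the input is assumed to be already segmented and stemmed (the Uyghur-specific preprocessing is not shown).

```python
from collections import defaultdict


def textrank_keywords(sentences, window=2, damping=0.85, iterations=50, top_k=5):
    """Rank words with a TextRank-style iteration (hypothetical helper).

    `sentences` is a list of token lists; tokens are assumed to be
    already segmented and stemmed by an upstream preprocessing step.
    """
    # Undirected co-occurrence graph: two words are linked when they
    # appear within `window` positions of each other in one sentence.
    neighbors = defaultdict(set)
    for sent in sentences:
        for i, word in enumerate(sent):
            for other in sent[i + 1 : i + 1 + window]:
                if other != word:
                    neighbors[word].add(other)
                    neighbors[other].add(word)

    # Synchronous score updates for a fixed number of iterations.
    scores = {word: 1.0 for word in neighbors}
    for _ in range(iterations):
        scores = {
            word: (1 - damping)
            + damping * sum(scores[n] / len(neighbors[n]) for n in neighbors[word])
            for word in neighbors
        }

    # Highest score first; ties are broken alphabetically for determinism.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:top_k]


def summarize(sentences, keywords, max_sentences=1):
    """Keep the sentences containing the most keywords, in document order."""
    keyword_set = set(keywords)
    scored = [
        (sum(1 for tok in sent if tok in keyword_set), idx)
        for idx, sent in enumerate(sentences)
    ]
    best = sorted(scored, key=lambda p: (-p[0], p[1]))[:max_sentences]
    return [sentences[idx] for _, idx in sorted(best, key=lambda p: p[1])]


if __name__ == "__main__":
    # Placeholder English tokens stand in for stemmed Uyghur words.
    doc = [
        ["summary", "keyword", "graph"],
        ["weather", "today"],
        ["keyword", "graph", "rank", "summary"],
    ]
    keywords = textrank_keywords(doc, top_k=3)
    print(keywords)
    print(summarize(doc, keywords, max_sentences=1))
```

A TF-IDF variant would replace `textrank_keywords` with a scorer that weights each word by its frequency in the document and its rarity across a reference corpus; `summarize` would stay the same, which makes the two extractors easy to compare under the same sentence selection rule.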

Key words

Uyghur/automatic summarization/TF-IDF algorithm/TextRank/ROUGE

Classification

Information Technology and Security Science

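Cite this article

买哈铺热提·外力, 赵梦原, 艾斯卡尔·艾木都拉. Research on keyword-based automatic summarization of single Uyghur documents [J]. Computer Engineering and Applications, 2015, (16): 130-135, 6.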

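Funding

National Natural Science Foundation of China (No. 6163033); High-Tech Development Research Program of the Xinjiang Uyghur Autonomous Region (No. 201312103); Program for New Century Excellent Talents in University, Ministry of Education (No. NCET-10-0969).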

Computer Engineering and Applications

OA / Peking University Core / CSCD / CSTPCD

ISSN: 1002-8331
