面向韵律层边界自动划分的维吾尔语词性自动标注技术研究

努尔比娅·塔依尔 地里木拉提·吐尔逊 艾斯卡尔·肉孜

Computer Applications and Software (计算机应用与软件), 2011, Vol. 28, Issue 8: 165-168, 4.

ON TECHNOLOGY OF AUTOMATICALLY TAGGING POS OF UYGHUR SENTENCES ORIENTED TOWARD AUTO-DIVISION OF PROSODIC LAYER BOUNDARY

努尔比娅·塔依尔 1, 地里木拉提·吐尔逊 1, 艾斯卡尔·肉孜 2

Author information

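  • 1. School of Information Science and Engineering, Xinjiang University, Urumqi, Xinjiang 830046
  • 2. School of Mathematics and System Sciences, Xinjiang University, Urumqi, Xinjiang 830046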

Abstract

This paper studies automatic part-of-speech (POS) tagging of Uyghur sentences, motivated by the automatic division of prosodic-layer boundaries in the text analysis module of a speech synthesis system. First, the POS categories and their decision rules are defined according to the characteristics of this application; candidate text sentences are then selected and manually tagged, and POS probability tables and reference tables are derived from statistics over the tagged text. Finally, automatic POS tagging of Uyghur is implemented with a bigram model based on a hidden Markov model (HMM). To verify the validity of the method, a large-scale text corpus of more than 10,000 sentences was used as the training sample and an additional 500 sentences as the test sample. The test results show that the approach is feasible and valid.
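
The pipeline summarized in the abstract (hand-tagged sentences, probability tables derived from counts, bigram HMM tagging) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the tag names `D`, `N`, `V`, the English toy corpus, the `<s>` start pseudo-tag, the add-alpha smoothing, and the function names `train`, `log_prob`, and `viterbi`. The paper's actual tagset, tables, and decoding details are not given on this page.

```python
import math
from collections import Counter, defaultdict

START = "<s>"  # hypothetical sentence-start pseudo-tag


def train(tagged_sentences):
    """Count tag-bigram transitions and word emissions from hand-tagged sentences."""
    trans = defaultdict(Counter)  # trans[prev_tag][tag] = count
    emit = defaultdict(Counter)   # emit[tag][word] = count
    for sentence in tagged_sentences:
        prev = START
        for word, tag in sentence:
            trans[prev][tag] += 1
            emit[tag][word] += 1
            prev = tag
    return trans, emit


def log_prob(counter, key, n_outcomes, alpha=1.0):
    """Add-alpha smoothed log probability (the smoothing is an assumption of this sketch)."""
    total = sum(counter.values())
    return math.log((counter[key] + alpha) / (total + alpha * n_outcomes))


def viterbi(words, trans, emit):
    """Return the most probable tag sequence for `words` under the bigram HMM."""
    if not words:
        return []
    tags = sorted(emit)
    n_tags = len(tags)
    # Vocabulary size, plus one slot reserved for unseen words.
    n_words = len({w for counts in emit.values() for w in counts}) + 1
    # best[t] = (log score of the best path ending in tag t, that path)
    best = {
        t: (log_prob(trans[START], t, n_tags)
            + log_prob(emit[t], words[0], n_words), [t])
        for t in tags
    }
    for word in words[1:]:
        new_best = {}
        for t in tags:
            # Pick the best predecessor tag for t.
            score, path = max(
                ((best[p][0] + log_prob(trans[p], t, n_tags), best[p][1])
                 for p in tags),
                key=lambda item: item[0],
            )
            new_best[t] = (score + log_prob(emit[t], word, n_words), path + [t])
        best = new_best
    return max(best.values(), key=lambda item: item[0])[1]


# Toy hand-tagged corpus (illustrative only, not data from the paper).
toy_corpus = [
    [("the", "D"), ("dog", "N"), ("runs", "V")],
    [("the", "D"), ("cat", "N"), ("sleeps", "V")],
    [("a", "D"), ("dog", "N"), ("sleeps", "V")],
]
trans, emit = train(toy_corpus)
print(viterbi(["a", "cat", "runs"], trans, emit))  # ['D', 'N', 'V']
```

Working in log space avoids numerical underflow on long sentences, and the extra vocabulary slot lets the sketch assign a small probability to words that never appeared in the training sample.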

Key words

Uyghur; Prosodic layer; POS tagging; Speech synthesis; Text analysis

Classification

Information Technology and Security Science

Cite this article

努尔比娅·塔依尔, 地里木拉提·吐尔逊, 艾斯卡尔·肉孜. 面向韵律层边界自动划分的维吾尔语词性自动标注技术研究[J]. 计算机应用与软件, 2011, 28(8): 165-168, 4.

Funding

Natural Science Foundation of Xinjiang Uygur Autonomous Region (200821125)

Xinjiang University Natural Science Foundation, University-College Joint Project (XY080101, XY080127)

Computer Applications and Software (计算机应用与软件)

Indexing: OA, Peking University Core Journals (北大核心), CSCD, CSTPCD

ISSN: 1000-386X
