
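Mining Chinese-Khmer Named Entity Equivalent Pairs from Comparable Corpora Based on Feature Similarity (基于特征相似度的可比语料挖掘汉柬命名实体等价对)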

Xu Lu (徐璐), Yan Xin (严馨), Xia Qing (夏青), Zhou Feng (周枫), Mo Yuanyuan (莫源源)

Computer and Digital Engineering (计算机与数字工程), 2017, Vol. 45, Issue 5: 882-885, 910, 5. DOI: 10.3969/j.issn.1672-9722.2017.05.020

Chinese-Khmer Named Entity Equivalents Excavation Based on Feature Similarity in Comparable Corpus

Xu Lu 1, Yan Xin 2, Xia Qing 1, Zhou Feng 2, Mo Yuanyuan 1

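Author information

  • 1. Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500
  • 2. Key Laboratory of Intelligent Information Processing, Kunming University of Science and Technology, Kunming 650500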

Abstract

Named entity translation equivalents play a significant role in cross-language information processing. However, because corpus resources are limited, few in-depth studies have addressed the extraction of bilingual Chinese-Khmer named entity equivalents. Starting from comparable corpus text, and based on the characteristics of each entity type and of the comparable corpus, the paper selects four features of bilingual Chinese-Khmer named entity equivalents: a transliteration feature, a translation feature, a context feature, and a length feature. It then proposes a multi-feature fusion method that computes similarity in order to mine bilingual Chinese-Khmer named entity equivalents. Experiments show that the method performs well when equivalents are acquired through the computation of feature similarity, and that it gives better results than methods that use only a single feature.
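The abstract does not state how the four feature similarities are combined. As a minimal sketch only, assuming a simple weighted average, the fusion step might look like the following. The names `FeatureScores`, `length_similarity`, `fused_similarity`, and `best_equivalent`, the weights, and the threshold are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class FeatureScores:
    """Per-feature similarities in [0, 1] for one candidate pair (hypothetical inputs)."""
    transliteration: float
    translation: float
    context: float
    length: float


def length_similarity(len_zh: int, len_km: int) -> float:
    """Illustrative length feature: ratio of the shorter length to the longer one."""
    if len_zh <= 0 or len_km <= 0:
        return 0.0
    return min(len_zh, len_km) / max(len_zh, len_km)


# Hypothetical weights: the abstract gives neither the fusion formula nor weights.
DEFAULT_WEIGHTS = {
    "transliteration": 0.4,
    "translation": 0.3,
    "context": 0.2,
    "length": 0.1,
}


def fused_similarity(scores: FeatureScores, weights=DEFAULT_WEIGHTS) -> float:
    """Weighted average of the four feature similarities (assumed fusion rule)."""
    total = sum(weights.values())
    return sum(weights[name] * getattr(scores, name) for name in weights) / total


def best_equivalent(candidates: dict, threshold: float = 0.5):
    """Return the highest-scoring candidate at or above the threshold, else None."""
    if not candidates:
        return None
    name, scores = max(candidates.items(), key=lambda kv: fused_similarity(kv[1]))
    return name if fused_similarity(scores) >= threshold else None
```

Under these assumptions, a single-feature baseline corresponds to putting all the weight on one feature, which is one way to set up the comparison the abstract describes between the fused method and single-feature methods.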

Keywords (Chinese)

命名实体等价对/汉柬双语/多特征融合/可比语料/音译模型

Key words

named entity equivalents/Chinese-Khmer bilingual/multi-feature fusion/comparable corpus/transliteration model

Classification

Information Technology and Security Science

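Cite this article

Xu Lu, Yan Xin, Xia Qing, Zhou Feng, Mo Yuanyuan. 基于特征相似度的可比语料挖掘汉柬命名实体等价对 [Chinese-Khmer Named Entity Equivalents Excavation Based on Feature Similarity in Comparable Corpus] [J]. Computer and Digital Engineering, 2017, 45(5): 882-885, 910, 5.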

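Funding

Supported by the National Natural Science Foundation of China project "Research on Khmer Named Entity Recognition and Methods for Constructing a Chinese-Khmer Bilingual Corpus" (Grant No. 61462055)

and the National Natural Science Foundation of China project "Research on Key Technologies for Extracting Vietnamese News Event Elements Based on Discourse Features" (Grant No. 61562049).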

Computer and Digital Engineering (计算机与数字工程), OA, CSTPCD, ISSN 1672-9722
