Journal of Guangxi Minzu University (Natural Science Edition), 2024, Vol. 30, Issue 1: 91-98 (8 pages).
Research and Application of Modern Burmese Word Segmentation Scheme for Natural Language Processing
(Original Chinese title: 面向自然语言处理的现代缅文分词规范研制与应用)
Abstract
Burmese word segmentation is an indispensable basic task in Burmese natural language processing, and the segmentation specification is a key problem in research on automatic word segmentation. Drawing on experience with word segmentation in Chinese, Tibetan and other languages, and taking into account the characteristics of Burmese, the way Burmese is encoded on computers, and Burmese grammar, this paper proposes a relatively systematic word segmentation scheme suited to modern Burmese. Based on this scheme, an open-source, manually annotated Burmese word segmentation corpus is re-annotated. The experimental results show that the scheme performs better under six common word segmentation algorithms.
Keywords
Myanmar / Natural Language Processing / Modern Burmese / Word Segmentation Scheme
Classification
Information Technology and Security Science
Cite this article
Chen Yu, Qin Donghong, Zhang Hui, Zhang Xiaoyan, Yang Guoying, Ou Jiangling, Pang Juncai. Research and Application of Modern Burmese Word Segmentation Scheme for Natural Language Processing [J]. Journal of Guangxi Minzu University (Natural Science Edition), 2024, 30(1): 91-98.
Funding
National Natural Science Foundation of China (61462009, 61862007)
Guangxi Natural Science Foundation (2018GXNSFAA281269)
Innovation Project of Guangxi Graduate Education (YCSW2023268)
Guangxi Minzu University Teaching Reform Project (2021XJGY10)