Journal of Shanxi University (Natural Science Edition), 2024, Vol. 47, Issue 3: 518-527 (10 pages). DOI: 10.13451/j.sxu.ns.2024003
SSHGCN:基于音形异构图卷积的中文纠错方法
SSHGCN: A Chinese Error Correction Method Based on Heterogeneous Graph Convolution with Phonological and Visual Features
Abstract
Chinese spelling correction aims to detect and correct spelling errors in Chinese text. Existing methods have attempted to model character similarity as graph-structured information. However, the graphs used by current methods ignore the deeper phonetic proximity among Chinese characters, and these methods lack a multimodal fusion mechanism that fully exploits both the sound and the shape of characters. This paper therefore derives phonetic similarity relations from the initials and finals of Chinese characters and the importance of pinyin, and combines them with shape proximity relations to construct a heterogeneous graph of phonetic and shape proximity among Chinese characters. Heterogeneous graph convolution is applied to this graph so that the sound and shape information of Chinese characters complement each other and are fully integrated. The method surpasses all compared methods in sentence-level F1 score on the SIGHAN15 benchmark and is comparable to the best compared method on the SIGHAN13 benchmark, which supports its effectiveness.
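The graph construction outlined in the abstract can be illustrated with a minimal sketch. The abstract does not state the similarity formula, the thresholds, or the form of the convolution layer, so everything below is an assumption made for illustration: the toy pinyin table, the shape-confusion pairs, the 0.4/0.6 weights, the 0.5 threshold, and all function names are hypothetical. Plain mean aggregation with fixed relation weights stands in for the learned heterogeneous graph convolution.

```python
# Hypothetical toy data: character -> (initial, final) of its pinyin, tone omitted.
PINYIN = {
    "他": ("t", "a"),
    "她": ("t", "a"),
    "大": ("d", "a"),
    "地": ("d", "i"),
    "第": ("d", "i"),
}
# Hypothetical shape-confusion pairs; a real system would read a confusion set.
SHAPE_PAIRS = [("他", "她"), ("他", "地"), ("她", "地")]


def phonetic_similarity(a, b, w_initial=0.4, w_final=0.6):
    """Assumed score: weighted match of the initial and the final."""
    (ia, fa), (ib, fb) = PINYIN[a], PINYIN[b]
    return w_initial * (ia == ib) + w_final * (fa == fb)


def build_hetero_graph(chars, threshold=0.5):
    """Return {relation: {char: [neighbors]}} for 'phonetic' and 'shape' edges."""
    graph = {
        "phonetic": {c: [] for c in chars},
        "shape": {c: [] for c in chars},
    }
    for i, a in enumerate(chars):
        for b in chars[i + 1:]:
            if phonetic_similarity(a, b) >= threshold:
                graph["phonetic"][a].append(b)
                graph["phonetic"][b].append(a)
    for a, b in SHAPE_PAIRS:
        graph["shape"][a].append(b)
        graph["shape"][b].append(a)
    return graph


def hetero_conv(graph, features, relation_weights):
    """One simplified layer: mean over self and neighbors within each relation,
    then a weighted sum across relations (no learned parameters)."""
    out = {}
    for c, x in features.items():
        agg = [0.0] * len(x)
        for rel, w in relation_weights.items():
            group = [x] + [features[n] for n in graph[rel][c]]
            mean = [sum(col) / len(group) for col in zip(*group)]
            agg = [s + w * m for s, m in zip(agg, mean)]
        out[c] = agg
    return out


chars = list(PINYIN)
graph = build_hetero_graph(chars)
# One-hot features make it easy to see which neighbors contribute.
features = {
    c: [1.0 if i == j else 0.0 for j in range(len(chars))]
    for i, c in enumerate(chars)
}
updated = hetero_conv(graph, features, {"phonetic": 0.5, "shape": 0.5})
```

Keeping one adjacency structure per relation means each kind of proximity is aggregated separately before the two are fused, which is the general idea behind convolution on a heterogeneous graph. A trained model would replace the fixed relation weights with learned parameters.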
Keywords
Chinese spelling correction / multimodal information fusion method / character similarity / pinyin similarity
Classification
Information Technology and Security Science
Cite this article
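任俊 (Ren Jun), 黄瑞章 (Huang Ruizhang). SSHGCN: A Chinese Error Correction Method Based on Heterogeneous Graph Convolution with Phonological and Visual Features [J]. Journal of Shanxi University (Natural Science Edition), 2024, 47(3): 518-527 (10 pages).
Funding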
National Natural Science Foundation of China (62066007)
Guizhou Provincial Science and Technology Support Program (2022277)