Computer Engineering, 2019, Vol. 45, Issue (3): 309-314, 6. DOI: 10.19678/j.issn.1000-3428.0050407
Natural Language Steganalysis Method Based on Word2vec
Abstract
To represent the semantic information of text content numerically and to improve the accuracy of detecting stego texts based on synonym substitution, a novel natural language steganalysis method is proposed. Word2vec is trained on a large-scale corpus to obtain multi-dimensional word vectors that carry rich semantic information. The cosine distance between the word vector of a synonym and that of a context word is then used to measure the correlation between the two words, and from it the fitness of the synonym in a specific context is calculated. Based on how the synonym substitutions made during embedding affect the context fitness of synonyms, detection features are extracted to form a feature vector, and a Bayesian classification model is trained on these feature vectors to detect stego texts. Experimental results show that the proposed method performs well: its average detection precision and average recall on stego texts with different embedding rates reach 97.71% and 92.64%, respectively.
Keywords
natural language/word vector/synonym substitution/steganalysis/context fitness
Classification
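Information Technology and Security Science
Cite this article
喻靖民, 向凌云, 曾道建. Natural Language Steganalysis Method Based on Word2vec[J]. Computer Engineering, 2019, 45(3): 309-314, 6.
Funding
National Natural Science Foundation of China (61202439, 61602059)
Key Scientific Research Project of the Hunan Provincial Department of Education (16A008)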
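
The abstract describes scoring how well a synonym fits its context by comparing word vectors. The sketch below illustrates that idea only and is not the authors' implementation. It makes three assumptions that the abstract does not state: the toy 3-dimensional vectors in `TOY_VECTORS` are invented for illustration (real Word2vec vectors would come from a trained model), cosine similarity is used as the correlation measure, and the fitness is taken as the mean similarity over the context words. The function names `cosine_similarity` and `context_fitness` are likewise hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def context_fitness(word, context, vectors):
    """Mean cosine similarity between `word` and its context words.

    Averaging is an assumed aggregation; the abstract does not give the formula.
    Words without a vector are skipped, and 0.0 is returned if nothing can be compared.
    """
    if word not in vectors:
        return 0.0
    sims = [
        cosine_similarity(vectors[word], vectors[c])
        for c in context
        if c in vectors and c != word
    ]
    return sum(sims) / len(sims) if sims else 0.0


# Hypothetical toy vectors, invented for this example only.
TOY_VECTORS = {
    "big": [0.9, 0.1, 0.0],
    "grand": [0.1, 0.2, 0.9],
    "house": [0.7, 0.3, 0.0],
    "garden": [0.6, 0.4, 0.1],
}

context = ["house", "garden"]
fit_original = context_fitness("big", context, TOY_VECTORS)
fit_substitute = context_fitness("grand", context, TOY_VECTORS)
print(f"fitness(big)   = {fit_original:.3f}")
print(f"fitness(grand) = {fit_substitute:.3f}")
```

With these toy vectors, the substituted synonym scores lower than the original word in the same context. A drop of this kind is the sort of signal from which detection features could be built before being passed to a classifier.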