Application Research of Computers (计算机应用研究), 2017, Vol. 34, Issue 8: 2269-2273, 5. DOI: 10.3969/j.issn.1001-3695.2017.08.006
LSI_LDA: mixture method for feature dimensionality reduction
Abstract
When LDA assigns a topic label to each word in the original space, it does not take the input space into account effectively. The original input contains non-action terms, which strongly distort the topic distribution and reduce classification accuracy. To remedy this, this paper proposes a new algorithm, LSI_LDA. First, an LSI model maps the input space to the latent semantic space. Second, key features are extracted according to their semantic relations. Finally, the LDA model is applied to the resulting simpler and more pertinent space. On the Fudan University corpus, the proposed method improves classification accuracy by 1.50% over LDA alone. This experimental result shows that LSI_LDA performs better in text categorization.
Key words
text categorization / feature dimensionality reduction / latent semantic indexing (LSI) / latent Dirichlet allocation (LDA)
Classification
Information Technology and Security Science
Cite this article
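Shi Qingwei (史庆伟), Cong Shiyuan (从世源), Tang Xiaoliang (唐晓亮). LSI_LDA: mixture method for feature dimensionality reduction [J]. Application Research of Computers, 2017, 34(8): 2269-2273, 5.
Funding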
National Natural Science Foundation of China, Young Scientists Fund (61401185)
General Scientific Research Project of the Education Department of Liaoning Province (L2013133)
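
The three steps named in the abstract (LSI mapping, key-feature extraction, LDA on the reduced space) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how key features are chosen, so the scoring rule here (length of each term's vector in the rank-k latent space, keeping the n_keep longest) is an assumption, as are the toy corpus, all function names, and the parameters k, n_keep, alpha and beta. The SVD is obtained by power iteration on the term Gram matrix, and LDA is a small collapsed Gibbs sampler, both written in plain Python so the example has no dependencies.

```python
import math
import random

def term_doc_matrix(docs):
    """Raw-count term-document matrix A (terms x documents)."""
    vocab = sorted({w for doc in docs for w in doc})
    index = {w: i for i, w in enumerate(vocab)}
    A = [[0.0] * len(docs) for _ in vocab]
    for j, doc in enumerate(docs):
        for w in doc:
            A[index[w]][j] += 1.0
    return vocab, A

def top_eigenpairs(M, k, iters=300):
    """Top-k eigenpairs of a symmetric PSD matrix: power iteration + deflation."""
    n = len(M)
    M = [row[:] for row in M]
    pairs = []
    for _ in range(k):
        v = [1.0 / math.sqrt(n)] * n
        for _ in range(iters):
            w = [sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(x * x for x in w))
            v = [x / norm for x in w]
        lam = sum(v[i] * sum(M[i][j] * v[j] for j in range(n)) for i in range(n))
        pairs.append((lam, v))
        for i in range(n):          # deflate: remove the component just found
            for j in range(n):
                M[i][j] -= lam * v[i] * v[j]
    return pairs

def lsi_select(docs, k, n_keep):
    """Steps 1-2 (assumed rule): keep the terms with the longest rank-k LSI vectors."""
    vocab, A = term_doc_matrix(docs)
    n = len(vocab)
    # Eigenvectors of A A^T are the left singular vectors of A; eigenvalues are sigma^2.
    G = [[sum(a * b for a, b in zip(A[i], A[j])) for j in range(n)] for i in range(n)]
    pairs = top_eigenpairs(G, k)
    score = [math.sqrt(sum(lam * u[i] ** 2 for lam, u in pairs)) for i in range(n)]
    ranked = sorted(range(n), key=lambda i: -score[i])
    return sorted(vocab[i] for i in ranked[:n_keep])

def lda_gibbs(docs, vocab, K, iters=200, alpha=0.1, beta=0.01, seed=0):
    """Step 3: collapsed Gibbs sampling for LDA; returns document-topic proportions."""
    rng = random.Random(seed)
    V, idx = len(vocab), {w: i for i, w in enumerate(vocab)}
    ndk = [[0] * K for _ in docs]       # topic counts per document
    nkw = [[0] * V for _ in range(K)]   # word counts per topic
    nk = [0] * K                        # total words per topic
    z = []
    for d, doc in enumerate(docs):      # random initial topic assignments
        z.append([rng.randrange(K) for _ in doc])
        for w, t in zip(doc, z[d]):
            ndk[d][t] += 1; nkw[t][idx[w]] += 1; nk[t] += 1
    for _ in range(iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                t, wi = z[d][i], idx[w]
                ndk[d][t] -= 1; nkw[t][wi] -= 1; nk[t] -= 1
                weights = [(ndk[d][s] + alpha) * (nkw[s][wi] + beta) / (nk[s] + V * beta)
                           for s in range(K)]
                t = rng.choices(range(K), weights=weights)[0]
                z[d][i] = t
                ndk[d][t] += 1; nkw[t][wi] += 1; nk[t] += 1
    return [[(ndk[d][s] + alpha) / (len(doc) + K * alpha) for s in range(K)]
            for d, doc in enumerate(docs)]

# Hypothetical toy corpus: two themes plus two rare noise terms ("zzz", "qqq").
sports = ["ball", "game", "team"]
finance = ["stock", "market", "price"]
docs = [sports, sports, sports + ["zzz"],
        finance, finance, finance + ["qqq"], finance]

kept = lsi_select(docs, k=2, n_keep=6)
keep_set = set(kept)
reduced = [[w for w in doc if w in keep_set] for doc in docs]
theta = lda_gibbs(reduced, kept, K=2)
print(kept)
```

On this toy corpus the two rare terms fall outside the six kept features, so LDA only sees the reduced vocabulary. A real corpus would need a weighting scheme such as TF-IDF before the LSI step, since with raw counts very frequent terms also receive long latent vectors.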