Computer Engineering, 2018, Vol. 44, Issue (1): 69-73, 78, 6. DOI: 10.3969/j.issn.1000-3428.2018.01.011
An Improved Supervised-LDA Text Model and Its Application
Abstract
The Supervised Latent Dirichlet Allocation (s-LDA) model cannot handle multi-label problems, and the distribution it assumes is not appropriate for classification. To address this, the Supervised Labeled-LDA (sl-LDA) model is proposed, which adds a category label on top of the response variable. The paper analyzes the s-LDA model and its problems in topic classification, and compares sl-LDA with s-LDA in terms of classification accuracy. Experimental results on Chinese and English news corpora show that classification performance improves by about 3.80% on the English corpus and by about 1.77% on the Chinese corpus.
Key words
s-LDA model/response variable/news topic/topic model/category label
Classification
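Information Technology and Security Science
Cite this article
Xu Tengteng, Huang Hengjun. An Improved Supervised-LDA Text Model and Its Application[J]. Computer Engineering, 2018, 44(1): 69-73, 78, 6.
Funding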
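National Social Science Fund of China Youth Project "Research on Air Quality Measurement Methods Based on Big Data Integration" (14CTJ009)
Longyuan Youth Innovative Talent Support Program project "Research on the Application of 'Waste Data' Based on Big Data Integration" (14GSD95)
Gansu Provincial Department of Finance Special Fund for Basic Scientific Research Operating Expenses of Universities "Research on Statistical Survey Techniques under Big Data Integration and Their Economic Applications" (GZ14007)
National Statistical Science Research Key Project "Research on Statistical Production from Web Big Data and Its Data Quality Control" (2017LZ43)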