Computer & Digital Engineering, 2019, Vol. 47, Issue 5: 1170-1174, 1228, 6. DOI: 10.3969/j.issn.1672-9722.2019.05.030
Research on Topic Model of Legal Texts Based on Keyword Weighting
Abstract
To reduce the dimensionality of legal texts and remove irrelevant words in legal text classification, a topic model is used to build a text classification model based on legal term weighting. Based on the differences in keywords across categories of legal texts, a keyword-marked word-to-topic distribution is introduced into the topic model. The weights are then learned and used to update the document-to-topic distribution, thereby improving the accuracy of document similarity calculation. Compared with the traditional topic model, the weighted topic model achieves better perplexity and text similarity on the Westlaw database.
Keywords
topic model / legal text / keywords / weighting / perplexity
Classification
Information Technology and Security Science
Cite this article
Zhang Yangwu, Li Guohe, Wang Limei. Research on Topic Model of Legal Texts Based on Keyword Weighting [J]. Computer & Digital Engineering, 2019, 47(5): 1170-1174, 1228, 6.
Funding
National Science and Technology Major Project (No. 2018YFC0831202)
National Natural Science Foundation of China (No. 60473125)
Research Start-up Fund of China University of Petroleum (Beijing), Karamay Campus (No. RCYJ2016B-03-001)