Digital Library Forum (数字图书馆论坛), 2023, Vol. 19, Issue 12: 32-43 (12 pages). DOI: 10.3772/j.issn.1673-2286.2023.12.004
Topic Recognition and Automatic Classification of Chinese Blockchain Patent Technology Based on Machine Learning
(Original title: 基于机器学习的中国区块链专利技术主题识别与自动分类研究)
Abstract
Automatically recognizing technology topics in the blockchain field and classifying them into topic categories provides intelligence support for broadening research and development directions and advancing the field. This paper takes Chinese blockchain technology patents in the Derwent patent database as samples and designs and implements a machine-learning-based model for blockchain technology topic recognition and automatic classification. Topic recognition is carried out with the LDA topic model. A classification system for technology topic categories is then built on the feature vector space of the patent literature, and blockchain technology topics are classified automatically with both traditional machine learning and deep learning models. The results show that the LDA topic model can effectively identify topic categories in the blockchain technology field and construct the feature vector space of those categories. Eighteen technology topics are identified, which can be grouped by research direction into four topic categories: blockchain architecture research, blockchain industry application research, data storage and data security protection research, and high-tech application research. By combining the LDA topic model with traditional machine learning and deep learning methods, technology topic categories in the field can be classified automatically and effectively. The classification results show that the support vector machine, LightGBM, LSTM, BP neural network, and logistic regression models perform comparatively well, with accuracy of 84% to 87% and precision of 79% to 83%; among these, the logistic regression model gives the most notable classification performance.
Keywords
LDA topic model / machine learning / blockchain / topic recognition / automatic classification
Classification
Social Sciences
Cite this article
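Hu Zewen, Wang Mengya, Han Yarong. Topic Recognition and Automatic Classification of Chinese Blockchain Patent Technology Based on Machine Learning [J]. Digital Library Forum, 2023, 19(12): 32-43.
Funding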
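This research was supported by the National Social Science Fund of China project "Research on Methods and Applications for Identifying Potential 'High-Quality' Work in Massive Scientific and Technical Literature" (Grant No. 20CTQ031) and the "Qing Lan Project" of Jiangsu Universities.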
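
Note on the reported metrics: the abstract gives accuracy and precision figures for classification into four topic categories. As a rough illustration of how these two metrics can be computed for multi-class predictions, here is a minimal sketch. The label names and predictions are made-up placeholders (not data from the paper), and macro-averaging of precision is an assumption, since the abstract does not state which averaging scheme was used.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted category equals the true one."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def macro_precision(y_true, y_pred):
    """Unweighted mean of per-category precision (assumed averaging scheme)."""
    labels = sorted(set(y_true) | set(y_pred))
    predicted = Counter(y_pred)  # how often each category was predicted
    true_pos = Counter(p for t, p in zip(y_true, y_pred) if t == p)
    per_class = [
        true_pos[c] / predicted[c] if predicted[c] else 0.0
        for c in labels
    ]
    return sum(per_class) / len(per_class)

# Hypothetical labels standing in for the four topic categories
y_true = ["architecture", "application", "storage_security", "high_tech",
          "architecture", "application", "storage_security", "high_tech"]
y_pred = ["architecture", "application", "storage_security", "application",
          "architecture", "application", "architecture", "high_tech"]

print(f"accuracy: {accuracy(y_true, y_pred):.2f}")
print(f"macro precision: {macro_precision(y_true, y_pred):.2f}")
```

Accuracy counts every correct prediction equally, whereas macro precision averages per-category precision, so the two can diverge when categories are imbalanced or when one category attracts many wrong predictions.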