Implementation of a Multi-Language Text Extraction System Based on Machine Learning (基于机器学习的多语言文本抽取系统实现)

Zeng Jun (曾军), Zhou Guofu (周国富)

Computer Applications and Software (计算机应用与软件), 2017, Vol. 34, Issue 4: 87-92,156,7. DOI: 10.3969/j.issn.1000-386x.2017.04.016

IMPLEMENTATION OF MULTI-LANGUAGE TEXT INFORMATION EXTRACTION SYSTEM BASED ON MACHINE LEARNING

Zeng Jun¹, Zhou Guofu¹

Author information

1. State Key Laboratory of Software Engineering, Wuhan University, Wuhan, Hubei 430072

Abstract

Information extraction based on statistical machine learning is an increasingly active research topic. Although some practical frameworks and systems for machine-learning-based text information extraction exist, most of them suffer from weak interactivity, low scalability, and poor portability across languages. To address these problems, a general and feasible multi-language information extraction framework is proposed, and a prototype system is implemented. The prototype integrates the maximum entropy and support vector machine algorithms, and both are used to verify the practicality of the system on English and Chinese texts.

Key words

Statistical machine learning / Information extraction / Multi-language / Maximum entropy model (ME) / Support vector machine (SVM)

Classification

Information Technology and Security Science

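Cite this article

Zeng Jun, Zhou Guofu. Implementation of multi-language text information extraction system based on machine learning[J]. Computer Applications and Software, 2017, 34(4): 87-92,156,7.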

Computer Applications and Software (计算机应用与软件)

OA / Peking University Core Journal (北大核心) / CSTPCD

ISSN: 1000-386X
