Research and Application of an Improved Hidden Markov Model in Web Information Extraction

Shuang Zhe (双哲), Sun Lei (孙蕾)

Computer Applications and Software, 2017, Vol. 34, Issue 2: 42-47, 6. DOI: 10.3969/j.issn.1000-386x.2017.02.007

RESEARCH AND APPLICATION FOR WEB INFORMATION EXTRACTION BASED ON IMPROVED HIDDEN MARKOV MODEL

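Shuang Zhe (双哲)¹, Sun Lei (孙蕾)¹

Author information

  • 1. Department of Computer Science and Technology, East China Normal University, Shanghai 200241, China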

Abstract

The task of information extraction is to obtain target information precisely and quickly from large-scale data and thereby improve the utilization of information. Based on the characteristics of web data, an improved hidden Markov model (HMM) for web information extraction is proposed, which incorporates the strength of the maximum entropy (ME) model in representing feature knowledge. A backward dependency assumption is added to the HMM, and the model parameters are adjusted using the characteristics of the emission units. The state transition probability and the output probability of the improved HMM depend not only on the current state of the model but are also corrected by the forward and backward state values of the model's historical states. Experimental results show that applying the improved HMM to web information extraction effectively improves extraction quality.
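
For orientation, the following is a minimal sketch of decoding with a standard first-order HMM, the baseline that the abstract says is being improved. It is not the authors' improved model: the abstract does not give the corrected transition and output formulas, so none are reproduced here. The state names (`title`, `price`), the tokens, and every probability value are hypothetical and chosen only for illustration.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most likely state sequence under a plain first-order HMM."""
    if not observations:
        return []

    def lg(p):
        # Work in log space; floor zero probabilities to avoid log(0).
        return math.log(max(p, floor))

    # best[t][s]: log-probability of the best path that ends in state s at step t
    best = [{s: lg(start_p[s]) + lg(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    back = [{}]  # back[t][s]: predecessor of s on that best path

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((r, best[t - 1][r] + lg(trans_p[r][s])) for r in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + lg(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Trace back from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Hypothetical toy model: label web-page tokens as part of a title or a price.
STATES = ["title", "price"]
START_P = {"title": 0.8, "price": 0.2}
TRANS_P = {
    "title": {"title": 0.5, "price": 0.5},
    "price": {"title": 0.2, "price": 0.8},
}
EMIT_P = {
    "title": {"laptop": 0.5, "pro": 0.4, "$999": 0.1},
    "price": {"laptop": 0.05, "pro": 0.05, "$999": 0.9},
}

labels = viterbi(["laptop", "pro", "$999"], STATES, START_P, TRANS_P, EMIT_P)
print(labels)  # ['title', 'title', 'price']
```

In this baseline, each transition depends only on the previous state and each emission only on the current state. The abstract describes relaxing exactly these assumptions by adding a backward dependency and ME-style feature knowledge.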

Key words

Hidden Markov model/Maximum entropy model/Web information extraction

Classification

Information Technology and Security Science

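Cite this article

Shuang Zhe, Sun Lei. Research and Application for Web Information Extraction Based on Improved Hidden Markov Model [J]. Computer Applications and Software, 2017, 34(2): 42-47, 6.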

Funding

National Natural Science Foundation of China (61502170).

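Computer Applications and Software

Open access; indexed in the Peking University Core Journals list (北大核心) and CSTPCD

ISSN 1000-386X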
