Elimination of Approximately Duplicate Records Based on an Improved SNM Algorithm

余肖生 胡孙枝

Journal of Chongqing University of Technology (Natural Science), 2016, Vol. 30, Issue 4: 91-96, 6. DOI: 10.3969/j.issn.1674-8425(z).2016.04.016

Research on Eliminating Duplicate Records Based on SNM Improved Algorithm

余肖生 1, 胡孙枝 1

Author information

  • 1. College of Computer and Information Technology, China Three Gorges University, Yichang, Hubei 443002, China

Abstract

High-quality data is the most important factor in building a data warehouse, and low-quality data can harm decision making. Approximately duplicate records from different data sources are one of the main data quality issues in building a data warehouse. Eliminating as many approximately duplicate records as possible before the source data enters the data warehouse can greatly improve data quality. The existing algorithms for eliminating approximately duplicate records were first compared, and the SNM algorithm was then improved. The authors compared the traditional SNM method with the improved SNM algorithm experimentally, and the results show that the improved SNM algorithm has clear advantages in eliminating duplicate records.
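
The abstract does not spell out how SNM works. As a rough illustration only, the traditional sorted-neighborhood method (SNM) is commonly described as three steps: build a sort key for each record, sort the records by that key, then slide a fixed-size window over the sorted list and compare each record only with its neighbors in the window. The Python sketch below follows that general description. The names `sort_key`, `similar`, and `snm_duplicates`, the key layout, the similarity measure, the threshold, and the window size are all assumptions made for this example. It is not the authors' implementation, and the improvements proposed in the paper are not shown.

```python
from difflib import SequenceMatcher


def sort_key(record):
    # Hypothetical key: first 3 characters of the name plus first 3 of the city.
    return (record["name"][:3] + record["city"][:3]).lower()


def similar(a, b, threshold=0.85):
    # Hypothetical similarity test: character-level ratio over the joined fields.
    text_a = " ".join(str(v) for v in a.values()).lower()
    text_b = " ".join(str(v) for v in b.values()).lower()
    return SequenceMatcher(None, text_a, text_b).ratio() >= threshold


def snm_duplicates(records, window=3):
    """Return index pairs (into `records`) flagged as approximate duplicates."""
    # Steps 1 and 2: build the keys and sort record indices by key.
    order = sorted(range(len(records)), key=lambda i: sort_key(records[i]))
    pairs = []
    # Step 3: compare each record with the previous (window - 1) sorted records.
    for pos, i in enumerate(order):
        for j in order[max(0, pos - window + 1):pos]:
            if similar(records[i], records[j]):
                pairs.append((min(i, j), max(i, j)))
    return sorted(pairs)


records = [
    {"name": "John Smith", "city": "Boston"},
    {"name": "Maria Garcia", "city": "Madrid"},
    {"name": "John Smyth", "city": "Boston"},
    {"name": "Li Wei", "city": "Yichang"},
]
print(snm_duplicates(records))  # [(0, 2)]
```

In this sketch the window bounds the work to roughly `len(records) * (window - 1)` comparisons, but two duplicates whose keys sort far apart are never compared, which is the usual weakness attributed to the fixed-window approach.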

Key words

SNM algorithm / improved SNM algorithm / approximately duplicate records elimination

Classification

Information technology and security science

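Cite this article

余肖生, 胡孙枝. Research on Eliminating Duplicate Records Based on SNM Improved Algorithm [J]. Journal of Chongqing University of Technology (Natural Science), 2016, 30(4): 91-96, 6.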

Funding

Supported by the National Natural Science Foundation of China ()

Journal of Chongqing University of Technology (Natural Science)

OA / CSTPCD

ISSN 1674-8425
