Research on Collaborative Filtering Mechanisms in E-mail Networks (邮件网络协同过滤机制研究)

杨震 赖英旭 段立娟 李玉鑑 许昕

Acta Automatica Sinica (自动化学报), 2012, Vol. 38, Issue 3: 399-411 (13 pages). DOI: 10.3724/SP.J.1004.2012.00399

Spam Collaborative Filtering in Enron E-mail Network

杨震 (1), 赖英旭 (1), 段立娟 (1), 李玉鑑 (1), 许昕 (1)

Author information

  • 1. College of Computer Science, Beijing University of Technology, Beijing 100124

Abstract

Social network analysis of the Enron corpus showed that the real e-mail network is, to some degree, a scale-free, small-world network. Building on this, a collaborative spam filtering method was designed around users' interactions. By adjusting the parameter A, users can choose to filter spam by themselves, to rely on other users, or to trade off between the two. Even without information about users' reading habits, the collaborative filtering method achieves good performance. Because the Enron corpus is unlabeled, we imposed an i.i.d. assumption on the training data set W and the test data set T and labeled the corpus with an improved EM (expectation-maximization) algorithm, in the sense of minimum statistical risk over W ∪ T. Experimental results showed that the collaborative filtering method is simple and effective, and that it steadily increases average accuracy compared with single-machine and ensemble filtering.

Key words

Text classification / spam filtering / e-mail network / collaborative filtering

Cite this article

杨震, 赖英旭, 段立娟, 李玉鑑, 许昕. 邮件网络协同过滤机制研究 [J]. 自动化学报 (Acta Automatica Sinica), 2012, 38(3): 399-411.

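Funding

Supported by the National Natural Science Foundation of China (61001178, 60905017, 61175115), the National Soft Science Research Program (2010GXQ5D317), the Beijing Natural Science Foundation (4102012, 4112009, 4102013, 4123093), the General Program of the Science and Technology Development Plan of the Beijing Municipal Education Commission (KM201210005024), the Key Program of the Science and Technology Development Plan of the Beijing Municipal Education Commission (KZ201210005007), the "Young and Middle-aged Backbone Talent Training Program" project of the Beijing Municipal Higher Education Talent Strengthening Plan (PHR201108016), the High-level Talent Training Project of Beijing University of Technology, and the Youth Fund of Beijing University of Technology.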

Acta Automatica Sinica (自动化学报)

Indexing: OA, PKU Core (北大核心), CSCD, CSTPCD

ISSN 0254-4156
