
A Split-and-Conquer Method for Massive Data


Statistics & Decision (统计与决策), 2018, Vol. 34, Issue 16: 74-76, 3. DOI: 10.13546/j.cnki.tjyjc.2018.16.018


Wen Kun (温焜)¹, Lan Xiaoran (兰晓然)²

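Author information

1. School of Management, Nanchang University (南昌大学管理学院), Nanchang 330029
2. Jiangxi Administration Institute (江西行政学院), Nanchang 330003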


Abstract

Lasso is widely used for variable selection, but on high-dimensional massive data sets its computational cost becomes too high. To address this, this paper proposes a split-and-conquer method: the high-dimensional data set is divided into K parts, variables are selected within each part, the selected feature sets are merged, and variable selection is then applied again to the merged set. To verify the proposed method, the paper runs experiments on six data sets and uses SVM, random forest, and neural network models for prediction. The results show that the split-and-conquer method performs well on high-dimensional massive data and substantially reduces running time.
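The procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions, not the authors' implementation: it assumes the data set is split by columns (features), which the abstract does not state explicitly; it assumes Lasso is the selector at both stages; and the function names `lasso_cd` and `split_and_conquer_select`, the penalty `alpha`, and the synthetic data are all hypothetical. The downstream prediction step (SVM, random forest, neural network) is omitted.

```python
import numpy as np


def lasso_cd(X, y, alpha, n_iter=200):
    """Coordinate-descent Lasso: min (1/2n)||y - X b||^2 + alpha * ||b||_1."""
    n, p = X.shape
    beta = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # per-column scale x_j'x_j / n
    resid = y.astype(float)                # residual y - X @ beta (beta starts at 0)
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue
            # Correlation of column j with the residual that excludes column j
            rho = X[:, j] @ resid / n + col_sq[j] * beta[j]
            # Soft-thresholding yields exact zeros for weak features
            new = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            if new != beta[j]:
                resid -= X[:, j] * (new - beta[j])
                beta[j] = new
    return beta


def split_and_conquer_select(X, y, k, alpha):
    """Assumed reading: split columns into k blocks, select per block,
    merge the selected features, then select once more on the merged set."""
    blocks = np.array_split(np.arange(X.shape[1]), k)
    merged = []
    for cols in blocks:
        beta = lasso_cd(X[:, cols], y, alpha)
        merged.extend(cols[beta != 0].tolist())
    merged = np.array(sorted(merged), dtype=int)
    if merged.size == 0:
        return merged
    beta = lasso_cd(X[:, merged], y, alpha)
    return merged[beta != 0]


# Hypothetical synthetic data: 3 informative features out of 60
rng = np.random.default_rng(0)
n, p = 200, 60
X = rng.standard_normal((n, p))
true_support = [3, 27, 51]
y = X[:, true_support] @ np.array([3.0, -2.0, 2.5]) + 0.1 * rng.standard_normal(n)
y = y - y.mean()

selected = split_and_conquer_select(X, y, k=3, alpha=0.5)
print("selected feature indices:", selected.tolist())
```

Each first-stage fit sees only p/K columns, which is where the running-time saving would come from in this reading; the second-stage fit then removes features that were selected only because the other blocks' signal acted as noise.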

Key words

split-and-conquer method / variable selection / high-dimensional data

Classification

Mathematical and physical sciences

Cite this article

Wen Kun, Lan Xiaoran. A split-and-conquer method for massive data [J]. Statistics & Decision (统计与决策), 2018, 34(16): 74-76, 3.

Statistics & Decision (统计与决策)

OA / PKU Core (北大核心) / CHSSCD / CSSCI / CSTPCD

ISSN: 1002-6487
