Preprocessing of High-Speed Rail Vibration Data Based on MapReduce

Zhao Chengbing, Li Tianrui, Wang Zhonggang, Gao Zizhe

Journal of Nanjing University (Natural Sciences), 2012, Vol. 48, Issue 4: 390-396, 7.

MapReduce based preprocessing on vibration data of high speed rail

Zhao Chengbing 1, Li Tianrui 1, Wang Zhonggang 1, Gao Zizhe 1

Author information

  • 1. School of Information Science and Technology, Southwest Jiaotong University, Chengdu 610031

Abstract

Analyzing high speed rail data to determine operational states is vital to the safety of rail transportation. Vibration data, one such kind of data, is collected by sampling multiple sensors at a fixed frequency such as 2500 Hz. A test lasting one or two days produces gigabytes of vibration data. Before analysis, preprocessing of the vibration data is indispensable; it includes erasing outliers and linear trend removal, among other steps. Erasing outliers means first identifying and locating the outliers in the data file using common rules, then replacing each outlier using its 4 neighboring data values. Linear trend removal means removing the linear offset that the test equipment introduces into the raw data. Traditional methods are inefficient because they process the data files serially, one by one, so the processing time is unacceptably long. Moreover, memory limits prevent them from handling large files, which forces them to randomly sample the raw data and analyze only a small part of it. This may lose important information in the vibration data. This paper aims to improve the efficiency of preprocessing vibration data. Cloud computing has received much attention for its idea of sharing computing capabilities and working cooperatively. Based on an analysis of the preprocessing methods for vibration data and of the MapReduce architecture in cloud computing, parallel methods for preprocessing vibration data, including erasing outliers and linear trend removal, are developed and implemented on the Hadoop platform. Experiments are designed to verify their effectiveness and parallel consistency. Performance experiments are conducted on a Hadoop cluster with six nodes (1 master and 5 slaves). The results show that the proposed methods can handle large files and improve processing efficiency. Moreover, the experimental results on three parallel performance indexes, Speedup, Scaleup and Sizeup, demonstrate the advantage of the methods.
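The two preprocessing steps named in the abstract can be illustrated with a minimal, single-process Python sketch. It is not the authors' Hadoop implementation. Several details are assumptions made only for illustration: the abstract does not say which "common rules" flag an outlier (a 3-sigma rule is assumed here), nor how the 4 neighboring values are combined (the mean of two values on each side is assumed). All function names are hypothetical. The sketch also shows one way a global least-squares trend can be computed from per-chunk partial sums, so that the chunked result matches the serial one.

```python
from functools import reduce
from statistics import mean, pstdev


def replace_outliers(samples, k=3.0):
    """Replace flagged points with the mean of up to 4 unflagged neighbours.

    Assumption: a point is an outlier if it lies more than k standard
    deviations from the mean of this chunk (3-sigma rule by default).
    """
    if not samples:
        return []
    mu, sigma = mean(samples), pstdev(samples)
    flagged = [abs(x - mu) > k * sigma for x in samples]
    cleaned = list(samples)
    for i, bad in enumerate(flagged):
        if not bad:
            continue
        # Two neighbours on each side, skipping other outliers and chunk edges.
        neighbours = [samples[j] for j in (i - 2, i - 1, i + 1, i + 2)
                      if 0 <= j < len(samples) and not flagged[j]]
        if neighbours:
            cleaned[i] = mean(neighbours)
    return cleaned


def map_partial_sums(offset, chunk):
    """Map step: emit (n, sum t, sum y, sum t^2, sum t*y) for one chunk."""
    st = sy = stt = sty = 0.0
    for i, y in enumerate(chunk):
        t = offset + i  # global sample index
        st += t
        sy += y
        stt += t * t
        sty += t * y
    return (len(chunk), st, sy, stt, sty)


def reduce_sums(a, b):
    """Reduce step: partial sums from different chunks simply add up."""
    return tuple(x + y for x, y in zip(a, b))


def fit_line(sums):
    """Least-squares slope and intercept from the combined sums."""
    n, st, sy, stt, sty = sums
    denom = n * stt - st * st
    slope = (n * sty - st * sy) / denom if denom else 0.0
    intercept = (sy - slope * st) / n
    return slope, intercept


def map_detrend(offset, chunk, slope, intercept):
    """Second map step: subtract the global line from each sample."""
    return [y - (slope * (offset + i) + intercept) for i, y in enumerate(chunk)]


def preprocess(chunks):
    """Run both steps over a list of (offset, samples) chunks, in order."""
    cleaned = [(off, replace_outliers(c)) for off, c in chunks]
    sums = reduce(reduce_sums, (map_partial_sums(off, c) for off, c in cleaned))
    slope, intercept = fit_line(sums)
    return [map_detrend(off, c, slope, intercept) for off, c in cleaned]
```

Calling `preprocess([(0, first_half), (len(first_half), second_half)])` returns one detrended list per chunk. In this sketch, outlier detection uses chunk-local statistics and does not look across chunk boundaries; that is a simplification, and a real job would need to handle samples near the split points.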

Key words

parallel/MapReduce/high speed rail/vibration/preprocessing

Classification

Computer science and automation

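Cite this article

Zhao Chengbing, Li Tianrui, Wang Zhonggang, Gao Zizhe. MapReduce based preprocessing on vibration data of high speed rail [J]. Journal of Nanjing University (Natural Sciences), 2012, 48(4): 390-396, 7.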

Funding

National Key Technology R&D Program of the 11th Five-Year Plan

Journal of Nanjing University (Natural Sciences)

OA / CSCD / CSTPCD

ISSN 0469-5097
