
Distributed Access Method for Multimodal Crop Phenotypic Data

HAO Zichao¹, ZHAO Xiangyu², PAN Shouhui³, LIU Dongming⁴, WANG Kaiyi²

农业机械学报 (Transactions of the Chinese Society for Agricultural Machinery), 2026, Vol. 57, Issue 1: 51-61, 11. DOI: 10.6041/j.issn.1000-1298.2026.01.005

Author information

  • 1. College of Information and Electrical Engineering, Shenyang Agricultural University, Shenyang 110866, China; Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
  • 2. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China
  • 3. Information Technology Research Center, Beijing Academy of Agriculture and Forestry Sciences, Beijing 100097, China; Beijing PAIDE Science and Technology Development Co., Ltd., Beijing 100097, China
  • 4. Beijing PAIDE Science and Technology Development Co., Ltd., Beijing 100097, China

Abstract

The rapid development of high-throughput crop phenotyping equipment has given breeding and cultivation research modern means of data collection, while also producing massive volumes of multimodal, unstructured phenotypic data. Traditional structured storage models can no longer meet the access-efficiency requirements of such data. A hybrid access framework based on distributed technology was proposed. It used HBase and HDFS to build a storage engine that fuses structured and unstructured data, and combined a client-side cache with a Redis cache to provide an efficient retrieval mechanism. Two core issues were optimized. First, to address the inherent defects of native HDFS in storing phenotypic data, a modality-aggregation-based MCH storage framework was designed. By classifying and merging phenotypic data by modality and building local indexes with double-layer hashing, it reduced NameNode memory pressure while improving the access efficiency and storage space utilization of single-modality data. Second, for high-concurrency read scenarios, a double-layer cache mechanism based on data popularity was constructed. It improved hot-data read efficiency through hierarchical metadata caching and introduced a data popularity evaluation model that combines access frequency with temporal characteristics, which improved the cache hit rate. Experimental results showed that, at a data scale of 1.0×10⁵, the proposed method reduced NameNode memory occupancy by 31.2% compared with the best native solution (SequenceFile) and reduced retrieval time by 25.4% compared with the best native solution (MapFile), providing technical support for the storage and retrieval of massive multimodal phenotypic data.
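The abstract names a popularity-driven double-layer cache but does not give its scoring formula or eviction policy, so the following Python sketch is only an illustration under stated assumptions. It assumes popularity is an access count that decays exponentially with time since the last access, that the fast tier stands in for the client-side cache and the slow tier for Redis, and that items are promoted to the fast tier once their score passes a threshold. The names `PopularityTracker`, `TwoLevelCache`, `half_life`, and `hot_threshold` are hypothetical and do not come from the paper.

```python
import math


class PopularityTracker:
    """Hypothetical popularity score: access count with exponential time decay."""

    def __init__(self, half_life=60.0):
        # Assumption: a score halves after `half_life` seconds without access.
        self.decay = math.log(2) / half_life
        self.state = {}  # key -> (score, time of last access)

    def record(self, key, now):
        """Decay the old score to `now`, add one access, and return the new score."""
        score, last = self.state.get(key, (0.0, now))
        score = score * math.exp(-self.decay * (now - last)) + 1.0
        self.state[key] = (score, now)
        return score

    def score(self, key, now):
        """Return the decayed score at `now` without counting an access."""
        if key not in self.state:
            return 0.0
        score, last = self.state[key]
        return score * math.exp(-self.decay * (now - last))


class TwoLevelCache:
    """Sketch of a two-tier cache: l1 ~ client-side cache, l2 ~ Redis stand-in."""

    def __init__(self, loader, l1_size=2, l2_size=8,
                 hot_threshold=1.5, half_life=60.0):
        self.loader = loader  # fallback read from the backing store
        self.l1, self.l2 = {}, {}
        self.l1_size, self.l2_size = l1_size, l2_size
        self.hot_threshold = hot_threshold
        self.pop = PopularityTracker(half_life)

    def _evict_coldest(self, tier, capacity, now):
        # Drop the least popular entries until the tier fits its capacity.
        while len(tier) > capacity:
            coldest = min(tier, key=lambda k: self.pop.score(k, now))
            del tier[coldest]

    def get(self, key, now):
        """Return (value, source), where source is 'l1', 'l2', or 'backend'."""
        score = self.pop.record(key, now)
        if key in self.l1:
            return self.l1[key], "l1"
        if key in self.l2:
            value, source = self.l2[key], "l2"
        else:
            value, source = self.loader(key), "backend"
            self.l2[key] = value
            self._evict_coldest(self.l2, self.l2_size, now)
        if score >= self.hot_threshold:
            # Hot data is promoted to the fast tier.
            self.l1[key] = value
            self._evict_coldest(self.l1, self.l1_size, now)
        return value, source
```

In this sketch a key read once stays in the slow tier, while a key read again shortly afterwards crosses the threshold and is served from the fast tier on later reads. A real deployment would replace the `l2` dictionary with a Redis client and tune the decay and threshold against measured hit rates.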

Key words

multimodal crop phenotypic data / distributed access / file merging / two-level cache mechanism

Classification

Information technology and security science

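Cite this article

HAO Zichao, ZHAO Xiangyu, PAN Shouhui, LIU Dongming, WANG Kaiyi. Distributed Access Method for Multimodal Crop Phenotypic Data[J]. 农业机械学报 (Transactions of the Chinese Society for Agricultural Machinery), 2026, 57(1): 51-61, 11.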

Funding

National Key Research and Development Program of China (2022YFD2002303-03) and Beijing Rural Revitalization Project (NY2401040425)

农业机械学报 (Transactions of the Chinese Society for Agricultural Machinery), ISSN 1000-1298
