
基于Vision Transformer-LSTM(ViTL)的多时序遥感影像农作物分类方法

张青云 杨辉 李兴伍 武永闯

安徽农业大学学报, 2024, Vol. 51, Issue (5): 888-898, 11. DOI: 10.13610/j.cnki.1672-352x.20241101.002


A crop classification method based on Vision Transformer-LSTM (ViTL) for multi-temporal remote sensing images

张青云 1, 杨辉 2, 李兴伍 3, 武永闯 3

Author information

  • 1. School of Resources and Environmental Engineering, Anhui University, Hefei 230601
  • 2. Institutes of Physical Science and Information Technology, Anhui University, Hefei 230601
  • 3. School of Artificial Intelligence, Anhui University, Hefei 230601

Abstract

Current research on remote sensing crop classification with deep learning models samples temporal and spatial information features inadequately, which makes it difficult to extract crops accurately because of boundary fuzziness, omissions, and misclassifications. In this study, we propose a deep learning model called Vision Transformer-long short-term memory (ViTL) to address these challenges. The ViTL model integrates three key modules: two-way ViT-Transformer feature extraction, spatio-temporal feature fusion, and long short-term memory (LSTM) temporal classification. The two-way ViT-Transformer feature extraction module captures the spatio-temporal feature correlation of the images, with one branch extracting spatial classification features and the other extracting temporal change features; the spatio-temporal feature fusion module cross-fuses multi-temporal feature information; and the LSTM temporal classification module captures multi-temporal dependencies and outputs the classification. The theory and methods of remote sensing based on multi-temporal satellite images were applied to extract crop information in Nehe City, Qiqihar, Heilongjiang Province. The results show that the ViTL model performed well: its overall accuracy (OA), mean intersection over union (MIoU), and F1 score reached 0.867 6, 0.698 7, and 0.817 5, respectively, and its F1 score was 9%-12% higher than those of other widely used deep learning methods, including 3-D convolutional neural networks (3-D CNNs), 2-D convolutional neural networks (2-D CNNs), and long short-term memory recurrent neural networks (LSTMs), which shows a significant advantage. The ViTL model provides a new approach to accurate and efficient crop classification.
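
To make the three-module design described in the abstract easier to follow, below is a minimal PyTorch sketch of a ViTL-style network: two parallel ViT encoder branches per image date, a fusion step, and an LSTM over the temporal sequence feeding a classifier. The patch size, embedding width, encoder depth, band count, class count, the concatenate-and-project fusion, and the tile-level (rather than pixel-wise) output are all illustrative assumptions, not details taken from the paper.

import torch
import torch.nn as nn


class PatchEmbed(nn.Module):
    """Split one image into patches and project them to token embeddings."""
    def __init__(self, in_channels=4, patch_size=8, embed_dim=128):
        super().__init__()
        self.proj = nn.Conv2d(in_channels, embed_dim,
                              kernel_size=patch_size, stride=patch_size)

    def forward(self, x):                      # x: (B, C, H, W)
        tokens = self.proj(x).flatten(2)       # (B, D, N_patches)
        return tokens.transpose(1, 2)          # (B, N_patches, D)


class ViTBranch(nn.Module):
    """One ViT-style encoder branch (used twice: spatial and temporal)."""
    def __init__(self, in_channels=4, embed_dim=128, depth=4, heads=4):
        super().__init__()
        self.embed = PatchEmbed(in_channels, embed_dim=embed_dim)
        layer = nn.TransformerEncoderLayer(d_model=embed_dim, nhead=heads,
                                           batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=depth)

    def forward(self, x):                      # x: (B, C, H, W)
        tokens = self.encoder(self.embed(x))   # (B, N, D)
        return tokens.mean(dim=1)              # pooled per-image feature, (B, D)


class ViTL(nn.Module):
    """Two-way ViT feature extraction -> feature fusion -> LSTM -> classifier."""
    def __init__(self, in_channels=4, embed_dim=128, num_classes=6):
        super().__init__()
        self.spatial_branch = ViTBranch(in_channels, embed_dim)
        self.temporal_branch = ViTBranch(in_channels, embed_dim)
        # Assumed fusion: concatenate both branch features and project back down.
        self.fuse = nn.Linear(2 * embed_dim, embed_dim)
        self.lstm = nn.LSTM(embed_dim, embed_dim, batch_first=True)
        self.head = nn.Linear(embed_dim, num_classes)

    def forward(self, x):                      # x: (B, T, C, H, W) multi-temporal stack
        t = x.size(1)
        feats = []
        for i in range(t):                     # per-date features from both branches
            frame = x[:, i]
            spatial = self.spatial_branch(frame)
            temporal = self.temporal_branch(frame)
            feats.append(self.fuse(torch.cat([spatial, temporal], dim=1)))
        seq = torch.stack(feats, dim=1)        # (B, T, D)
        out, _ = self.lstm(seq)                # LSTM captures multi-temporal dependencies
        return self.head(out[:, -1])           # classify from the last time step


if __name__ == "__main__":
    model = ViTL(in_channels=4, num_classes=6)
    dummy = torch.randn(2, 5, 4, 64, 64)       # 2 samples, 5 dates, 4 bands, 64x64 tiles
    print(model(dummy).shape)                  # torch.Size([2, 6])

In the paper, one branch extracts spatial classification features and the other temporal change features; here both branches share the same structure for brevity, and differentiating them (for example, feeding temporal-difference inputs to the second branch) would follow the same pattern.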


Key words

crop classification / Vision Transformer (ViT) / LSTM / deep learning / remote sensing monitoring

Classification

Agricultural science and technology

Cite this article

张青云, 杨辉, 李兴伍, 武永闯. 基于Vision Transformer-LSTM(ViTL)的多时序遥感影像农作物分类方法[J]. 安徽农业大学学报, 2024, 51(5): 888-898, 11.

Funding

Jointly supported by the National Natural Science Foundation of China (42101381) and the Anhui Provincial Key Research and Development International Cooperation Project (202104b11020022).
