PlutonicRocks-13: A dataset of class-imbalanced images of plutonic rocks

Chen Zhongliang, Hu Zhaoqi, Zheng Chaojie

China Scientific Data (Chinese and English Online Edition), 2026, Vol. 11, Issue 1: 3-18, 16. DOI: 10.11922/11-6035.nbsdc.2025.0210.zh


Chen Zhongliang (1), Hu Zhaoqi (1), Zheng Chaojie (2)

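Author information

  • 1. Geological Survey of Anhui Province (Anhui Institute of Geological Sciences), Hefei 230001
  • 2. School of Resources and Environmental Engineering, Hefei University of Technology, Hefei 230009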

Abstract

Lithology recognition is one of the fundamental skills of geologists. With the rise of artificial intelligence (AI), a fundamental challenge and opportunity in the geosciences is translating expert geological knowledge into AI models that deliver intelligent lithological recognition services, enabling geoscience enthusiasts and non-geologists to identify rock types more accurately. In natural environments, the spatial distribution of surface rocks is highly heterogeneous, so rock image datasets typically exhibit a long-tailed distribution. Taking plutonic rocks as an example, this study adopts the classification and nomenclature scheme of the textbook Petrology (edited by Yu Bingsong et al.) and introduces PlutonicRocks-13, an imbalanced dataset for rock image recognition. The dataset comprises 13 common types of plutonic rocks and contains 4,785 images with a total size of 2.49 GB. The rock types represented are: olivine, pyroxenite, hornblendite, gabbro, diorite, monzonite, syenite, nepheline syenite, granodiorite, monzogranite, syenogranite, plagiogranite, and graphic granite. Rock images were collected primarily from field outcrops and museum hand specimens, supplemented by online sources. After careful screening, processing, and annotation, these images were curated into PlutonicRocks-13, a dataset tailored for rock image classification. To ensure annotation quality, quality control and evaluation procedures were applied, including thin-section petrographic verification and bias detection based on explainable deep learning techniques. Furthermore, by converting annotated labels into question-answer pairs, the dataset can be used for instruction tuning of multimodal models, enabling them to perform rock image classification through natural language instructions. This image dataset provides reliable data support for research on automated rock image recognition and holds significant reference value for geological surveys, surficial substrate investigations, and public geoscience education.
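
The conversion of annotated labels into question-answer pairs mentioned in the abstract could look roughly like the following Python sketch. It is an illustration under stated assumptions, not the authors' procedure: the prompt wording, the record fields (image, question, answer), the function names, and the example file paths are all hypothetical. Only the 13 class names are taken from the abstract.

```python
import json

# The 13 class names exactly as listed in the abstract.
CLASSES = [
    "olivine", "pyroxenite", "hornblendite", "gabbro", "diorite",
    "monzonite", "syenite", "nepheline syenite", "granodiorite",
    "monzogranite", "syenogranite", "plagiogranite", "graphic granite",
]

# Hypothetical prompt wording; the template used by the authors is not given here.
QUESTION = "Which type of plutonic rock is shown in this image? Choose one of: {options}."


def label_to_qa(image_path: str, label: str) -> dict:
    """Turn one (image, class label) annotation into a question-answer record."""
    if label not in CLASSES:
        raise ValueError(f"unknown class label: {label!r}")
    return {
        "image": image_path,
        "question": QUESTION.format(options=", ".join(CLASSES)),
        "answer": label,
    }


def build_qa_records(annotations):
    """Convert an iterable of (image_path, label) pairs into a list of QA records."""
    return [label_to_qa(path, label) for path, label in annotations]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    demo = [
        ("images/gabbro_0001.jpg", "gabbro"),
        ("images/syenite_0042.jpg", "syenite"),
    ]
    for record in build_qa_records(demo):
        print(json.dumps(record))
```

Listing every class in the question keeps the answer space closed, which makes the model's reply easy to score against the original label. An open-ended question would be an equally plausible design.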

Key words

igneous rock / intrusive rock / long-tailed distribution / class imbalance / image classification

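Cite this article

Chen Zhongliang, Hu Zhaoqi, Zheng Chaojie. PlutonicRocks-13: A dataset of class-imbalanced images of plutonic rocks [J]. China Scientific Data (Chinese and English Online Edition), 2026, 11(1): 3-18, 16.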

Funding

National Natural Science Foundation of China (42372342, 42202328).

China Scientific Data (Chinese and English Online Edition)

ISSN 2096-2223
