Application Research of Computers, 2025, Vol. 42, Issue 7: 2025-2031, 7. DOI: 10.19734/j.issn.1001-3695.2024.10.0437
Post-training quantization strategy for referring image segmentation
Abstract
Referring image segmentation (RIS) aims to segment the object in an image that a sentence describes, by integrating visual and linguistic information. The technique has strong application prospects in interactive image editing and language-guided human-machine interaction. However, existing solutions tend to pursue high-performance models and neglect practical deployment on resource-limited edge devices. This paper proposes an efficient post-training quantization (PTQ) framework to address that challenge. Specifically, the analysis identifies the root cause of the performance collapse that occurs under round-to-nearest (RTN) quantization. The framework then proposes a two-region balanced quantization strategy to handle the non-normal distribution of activation values after the softmax and GELU operations in the visual encoder, and introduces a reordered grouping quantization strategy to handle the quantization problems caused by outliers in the linear-layer activations of the text encoder. Extensive experiments across different quantization bit widths on three benchmark datasets demonstrate clear advantages of the proposed method over existing PTQ methods. As the first quantization scheme designed specifically for the RIS task, it confirms the feasibility of efficiently deploying RIS models to edge devices using PTQ.
Keywords
referring image segmentation (RIS) / post-training quantization (PTQ) / cross-modal fusion / deep learning
Classification
Information technology and security science
Cite this article
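Yang Hang, Jiang Xiaoyan. Post-training quantization strategy for referring image segmentation[J]. Application Research of Computers, 2025, 42(7): 2025-2031, 7.
Funding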
National Natural Science Foundation of China (U2033218)
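
As a rough illustration of the abstract's point about post-softmax activations, the sketch below shows how plain round-to-nearest (RTN) quantization with a single uniform step size can collapse a peaked softmax row, and how splitting the range into two regions with separate step sizes reduces the error. This is not the paper's algorithm: the function names, the fixed `threshold`, and the rule of spending one bit on region selection are all assumptions made for the example.

```python
import math

def rtn_quantize(values, bits):
    """Uniform round-to-nearest over [0, max(values)]; returns dequantized values."""
    levels = 2 ** bits - 1
    scale = max(values) / levels            # one step size shared by every value
    return [round(v / scale) * scale for v in values]

def two_region_quantize(values, bits, threshold):
    """Hypothetical two-region variant: one bit picks the region, the remaining
    bits index a uniform grid inside [0, threshold] or [0, max(values)]."""
    levels = 2 ** (bits - 1) - 1
    small_scale = threshold / levels        # fine grid for the many small values
    large_scale = max(values) / levels      # coarse grid for the few large values
    out = []
    for v in values:
        scale = small_scale if v <= threshold else large_scale
        out.append(round(v / scale) * scale)
    return out

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def mean_abs_error(a, b):
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)

# A peaked attention row: one dominant logit and many small ones.
logits = [8.0] + [0.1 * i for i in range(31)]
probs = softmax(logits)

rtn = rtn_quantize(probs, bits=4)
two = two_region_quantize(probs, bits=4, threshold=0.01)

rtn_err = mean_abs_error(probs, rtn)
two_err = mean_abs_error(probs, two)
print(f"RTN error: {rtn_err:.6f}, two-region error: {two_err:.6f}")
```

With a single step size, every probability except the dominant one rounds to zero at 4 bits, so the quantized row loses all information about the small entries. The two-region grid keeps part of that information for the same bit budget. How the regions are chosen and balanced in the published method is not specified in the abstract.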