高技术通讯, 2025, Vol. 35, Issue 2: 167-174. DOI: 10.3772/j.issn.1002-0470.2025.02.006
Lite-IJformer: a lightweight method for long-sequence Transformers
Abstract
Aiming at the high computational complexity of long-sequence Transformers, this paper proposes a lightweight method called Lite-IJformer. The core idea consists of two steps: (1) linearize the self-attention to reduce its computational complexity from quadratic to linear; (2) based on low-rank matrix decomposition theory, reduce the dimension of the KV matrix multiplication to further shrink the computation. Experiments on the Long Range Arena (LRA) benchmark show that when the input sequence length is 1000-2000, linearization reduces the computation of self-attention by a factor of 13-26 and improves inference speed by 4.75-5.72 times without precision loss. After dimension reduction, the computation of self-attention is further reduced by 17.0%, and the model's inference speed increases by another 1.17 times, with a precision loss within 0.5%.
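To make the two steps concrete, the sketch below illustrates the general idea in NumPy. It is not the paper's implementation: the kernel feature map (elu + 1), the placement of the low-rank projection on V, and the projection matrix P are all assumptions chosen for illustration.

```python
import numpy as np

def feature_map(x):
    # elu(x) + 1: a positive kernel feature map commonly used to linearize
    # attention (an assumption; the paper's exact kernel is not given here).
    return np.where(x > 0, x + 1.0, np.exp(np.minimum(x, 0.0)))

def linear_attention(Q, K, V):
    # Softmax attention costs O(n^2 * d) because it forms the n x n score
    # matrix. Reordering to phi(Q) @ (phi(K)^T @ V) costs O(n * d^2):
    # the sequence length n no longer appears in a quadratic term.
    Qf, Kf = feature_map(Q), feature_map(K)   # (n, d)
    kv = Kf.T @ V                             # (d, d_v), independent of n
    norm = Qf @ Kf.sum(axis=0)[:, None]       # (n, 1) normalizer
    return (Qf @ kv) / (norm + 1e-6)

def low_rank_linear_attention(Q, K, V, P):
    # Step 2 sketch: project V onto a rank-r subspace before the KV product,
    # shrinking the (d, d_v) intermediate to (d, r). P is a hypothetical
    # (d_v, r) projection, e.g. from a truncated SVD of learned weights;
    # the paper's exact factorization may differ.
    out_r = linear_attention(Q, K, V @ P)     # attention in r dimensions
    return out_r @ P.T                        # lift back to d_v

if __name__ == "__main__":
    n, d, r = 1000, 64, 32
    rng = np.random.default_rng(0)
    Q = rng.standard_normal((n, d))
    K = rng.standard_normal((n, d))
    V = rng.standard_normal((n, d))
    P = np.linalg.qr(rng.standard_normal((d, r)))[0]    # orthonormal (d, r)
    print(low_rank_linear_attention(Q, K, V, P).shape)  # (1000, 64)
```

For scale: assuming a per-head dimension of d = 64, the cost ratio between quadratic and linear attention is roughly n^2 d / (n d^2) = n / d, i.e. about 16-31 for n = 1000-2000, which is consistent in order of magnitude with the 13-26 times reduction reported in the abstract.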
Keywords
Transformer / self-attention / linearization method / dimension reduction
Citation
连家诚, 郝一帆, 张曦珊, 支天, 孙广中. Lite-IJformer: a lightweight method for long-sequence Transformers [J]. 高技术通讯, 2025, 35(2): 167-174.
Funding
Supported by the National Key Research and Development Program of China (2022YFB4501601), the National Natural Science Foundation of China (U22A2028, U20A20227), and the CAS Project for Young Scientists in Basic Research (YSBR-029).