LVM-Empowered Image Semantic Communication via Next-pixel Prediction: A Separate Source-Channel Coding Perspective

Ren Tianqi (任天骐), Li Rongpeng (李荣鹏)

Journal of Signal Processing (信号处理), 2025, Vol. 41, Issue 10: 1657-1669 (13 pages). DOI: 10.12466/xhcl.2025.10.006

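Author information

  • 1. College of Information Science and Electronic Engineering, Zhejiang University, Hangzhou, Zhejiang 310027, China (both authors)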

Abstract

As the vision for 6G unfolds, semantic communication is emerging as a core technology. The prevailing paradigm, deep learning-based joint source-channel coding (JSCC), performs well under specific conditions but is hampered by inherent limitations such as poor compatibility with digital systems, weak generalization, and low design flexibility. To address these challenges, this study revisits the separate source-channel coding (SSCC) paradigm and proposes the large vision model-based separate source-channel coding framework (LVM-SSCC). This framework leverages large vision models (e.g., ImageGPT) for autoregressive pixel prediction, which, combined with arithmetic coding, achieves highly efficient lossless source compression. Concurrently, an error correction code transformer (ECCT) is introduced on the channel-coding side to enhance the robustness of low-density parity-check (LDPC) decoding. To ensure a fair comparison, this study uses a unified energy consumption-based signal-to-noise ratio (SNR_unified) metric. Extensive simulations on the CIFAR-10 dataset demonstrate that, under both additive white Gaussian noise (AWGN) and Rayleigh fading channels, the proposed scheme significantly outperforms mainstream JSCC schemes such as DeepJSCC and SparseSBC in terms of image reconstruction quality, measured by peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM). This is especially true in the mid-to-high SNR region, where the scheme achieves near-lossless reconstruction with high fidelity while maintaining full compatibility with digital communication systems. These results provide compelling evidence of the benefits of the SSCC paradigm for future image semantic communication, highlighting its comprehensive advantages in performance, compatibility, and flexibility.
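
The link between autoregressive prediction and lossless compression mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the predictors below are hypothetical stand-ins for a large vision model, and the function only computes the ideal code length, the sum of -log2 p(x_i | x_<i), which an arithmetic coder approaches to within a few bits. The point it shows is that better next-symbol predictions translate directly into fewer bits.

```python
import math


def ideal_code_length_bits(symbols, predict):
    """Return sum_i -log2 p(x_i | x_<i) for a sequence of integer symbols.

    `predict(context)` must return a probability list over the alphabet.
    An arithmetic coder driven by the same probabilities would emit
    roughly this many bits (plus a small constant overhead).
    """
    total = 0.0
    for i, x in enumerate(symbols):
        probs = predict(symbols[:i])  # distribution for the next symbol
        total += -math.log2(probs[x])
    return total


def make_uniform_predictor(alphabet_size):
    """Baseline that ignores context: every symbol is equally likely."""
    def predict(context):
        return [1.0 / alphabet_size] * alphabet_size
    return predict


def make_adaptive_predictor(alphabet_size):
    """Hypothetical stand-in for a learned model: add-one smoothed counts."""
    def predict(context):
        counts = [1] * alphabet_size
        for s in context:
            counts[s] += 1
        total = sum(counts)
        return [c / total for c in counts]
    return predict


if __name__ == "__main__":
    flat_patch = [0] * 8  # eight identical "pixels" from a 4-level alphabet
    print(ideal_code_length_bits(flat_patch, make_uniform_predictor(4)))
    print(ideal_code_length_bits(flat_patch, make_adaptive_predictor(4)))
```

On a flat patch the context-aware predictor needs fewer bits than the uniform baseline, which is the effect a stronger pixel predictor is meant to amplify on real images.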

Key words

semantic communication; lossless image transmission; separate source-channel coding (SSCC); large vision model (LVM); error correction code Transformer (ECCT)

Classification

Information Technology and Security Science

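Cite this article

Ren Tianqi, Li Rongpeng. LVM-Empowered Image Semantic Communication via Next-pixel Prediction: A Separate Source-Channel Coding Perspective[J]. Journal of Signal Processing, 2025, 41(10): 1657-1669 (13 pages).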

Funding

National Key Research and Development Program of China, Ministry of Science and Technology of China (2024YFE0200602)

Zhejiang Provincial Natural Science Foundation of China (LR23F010005)

Huawei Technologies Co., Ltd. cooperative project (TC20240829036)

Journal of Signal Processing (信号处理)

Open access (OA); listed as a Peking University Core Journal (北大核心)

ISSN 1003-0530
