Software Guide (软件导刊), 2025, Vol. 24, Issue 3: 31-36, 6. DOI: 10.11907/rjdk.241671
基于双路TCN的长短距离融合学习转录因子结合位点预测
Transcription Factor Binding Sites Prediction with Long-Short Distance Fusion Learning Based on Dual-path TCN
Abstract
Accurate identification of transcription factor binding sites (TFBSs) is crucial for understanding gene expression and regulatory mechanisms. Convolutional Neural Network (CNN) and Long Short-Term Memory (LSTM) models have significantly improved accuracy on this task compared with traditional machine learning approaches. However, CNNs specialize in learning local spatial features but ignore long-distance dependencies in DNA sequences, while LSTMs are proficient at learning sequential relationships but are computationally inefficient because they lack parallel computing ability. This paper proposes a novel dual-path sequential network that integrates long-short distance fusion learning to address these issues. In terms of structure, the paper employs the Temporal Convolutional Network (TCN) as a feature extractor that supports both sequential modeling and parallel processing. The dual-path structure can learn complementary DNA features, improving learning stability. In terms of features, the paper leverages the context modeling capability of the TCN and designs a long-short distance fusion learning strategy to strengthen the feature representation used for prediction. Experimental results on 165 ChIP-seq datasets show that the method outperforms popular deep-learning-based methods. This study introduces a valuable framework for TFBS prediction by combining sequential features that carry different distance dependency information.
Keywords
transcription factor / binding site / sequence data processing / temporal convolutional network / feature fusion
Classification
Information Technology and Security Science
Cite this article
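Wu Zhiqiang (吴志强), Song Jiazhi (宋佳智), Jiang Jingqing (姜静清), Luo Rui (罗蕊). Transcription Factor Binding Sites Prediction with Long-Short Distance Fusion Learning Based on Dual-path TCN[J]. Software Guide, 2025, 24(3): 31-36, 6.
Funding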
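National Natural Science Foundation of China (62162050)
Doctoral Scientific Research Start-up Fund of Inner Mongolia Minzu University (KYQD23006, BS672)
Natural Science Foundation of Inner Mongolia (2021BS03036)
Open Project of the Inner Mongolia Autonomous Region Engineering Research Center for Castor Industry Technology Innovation (MDK2021004, MDK2023012)
Open Project of the Inner Mongolia Autonomous Region Collaborative Innovation Center for the Castor Industry (MDK2022016)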
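Illustrative sketch

The abstract describes two TCN paths whose features cover different distances along the DNA sequence and are then fused. The following is a minimal sketch of that general idea, not the authors' architecture: the kernel size, the two dilation schedules, the placeholder weights (standing in for learned parameters), and fusion by per-position concatenation are all assumptions made for illustration, and every function name is hypothetical.

```python
# Illustrative sketch only; not the model from the paper.
# Assumed: kernel size 3, dilations (1, 2) vs (1, 2, 4, 8), fusion by concatenation.

BASES = "ACGT"

def one_hot(seq):
    """Encode a DNA string as T rows of 4 channels (A, C, G, T)."""
    return [[1.0 if base == b else 0.0 for b in BASES] for base in seq]

def dilated_causal_conv(x, kernel, dilation):
    """Causal 1D convolution with one output channel and left zero-padding.

    x:      T rows, each with C channels
    kernel: K rows, each with C weights; kernel[-1] is applied at position t
    """
    k = len(kernel)
    out = []
    for t in range(len(x)):
        total = 0.0
        for j in range(k):
            src = t - (k - 1 - j) * dilation  # look back in steps of `dilation`
            if src >= 0:                      # positions before the start count as zeros
                total += sum(w * v for w, v in zip(kernel[j], x[src]))
        out.append([max(total, 0.0)])         # ReLU; keep a channel axis
    return out

def receptive_field(kernel_size, dilations):
    """Number of input positions that one output position can see."""
    return 1 + (kernel_size - 1) * sum(dilations)

def run_path(x, kernel_size, dilations):
    """Stack dilated causal convolutions, one layer per dilation rate."""
    h = x
    for d in dilations:
        channels = len(h[0])
        # Placeholder weights standing in for learned parameters.
        kernel = [[(1 + j + c) / (kernel_size * channels) for c in range(channels)]
                  for j in range(kernel_size)]
        h = dilated_causal_conv(h, kernel, d)
    return h

def dual_path_features(seq, kernel_size=3,
                       short_dilations=(1, 2), long_dilations=(1, 2, 4, 8)):
    """Run a short-range and a long-range path, then concatenate per position."""
    x = one_hot(seq)
    short_feat = run_path(x, kernel_size, short_dilations)
    long_feat = run_path(x, kernel_size, long_dilations)
    return [s + l for s, l in zip(short_feat, long_feat)]

features = dual_path_features("ACGTACGTAC")
print(len(features), len(features[0]))      # 10 positions, 2 fused channels
print(receptive_field(3, (1, 2)))           # 7
print(receptive_field(3, (1, 2, 4, 8)))     # 31
```

With these assumed settings the short path sees 7 positions and the long path sees 31, so the fused vector at each position mixes a local and a wider context. Because each layer depends only on the previous layer's output and not on a recurrent state, all positions can be computed independently, which is the parallelism the abstract contrasts with LSTMs.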