Journal of Signal Processing (信号处理), 2025, Vol. 41, Issue 3: 448-471 (24 pages). DOI: 10.12466/xhcl.2025.03.004

Sequential Detection Based Quickest Markov Decision Processes: Theory, Algorithms, and Applications

Chen Zuxu (陈祖旭) 1, Chen Wei (陈巍) 2, Li Changkun (李长坤) 3, Han Yuxing (韩宇星) 1

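Author information

  • 1. Tsinghua Shenzhen International Graduate School, Shenzhen, Guangdong 518071
  • 2. Department of Electronic Engineering, Tsinghua University, Beijing 100084; State Key Laboratory of Space Network and Communications, Beijing 100084; Beijing National Research Center for Information Science and Technology, Beijing 100084
  • 3. Department of Electronic Engineering, Tsinghua University, Beijing 100084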

Abstract

In this paper, joint signal processing and control methods for complex dynamical systems with statistical change points, observation noise, aftereffects, and action latency were investigated to maximize the overall utility of delay-sensitive decision making. A unified framework combining quickest change detection from statistical signal processing with the Markov decision process from stochastic optimal control was presented, along with its potential applications in smart grids, disease control, and hydrology. By leveraging a four-dimensional constrained Markov decision process, the proposed framework maximized the expected reward, characterized by the weighted sum of income and risk, while satisfying various constraints arising from operations, feasibility, and the environment. In contrast to the conventional layered infrastructure, in which an action is launched only after the change point is detected, the new architecture enabled cross-layer, cross-disciplinary collaboration between signal processing and control, making real-time decisions in a much timelier manner based on instantaneous likelihood estimation. This paradigm shift brought substantial gains for dynamical or stochastic systems that are sensitive to latency in decision or control while suffering from large detection delays and/or strong aftereffects. The joint detection and control strategy was shown to outperform the control-after-detection policy in smart grid, disease control, and hydrology scenarios, with considerable gains observed. Finally, we briefly envisioned potential applications of sequential detection based quickest Markov decision processes in seafloor carbon capture and storage, as well as in network attack detection and mitigation.

Key words

statistical signal processing / stochastic optimal control / sequential detection / quickest change detection / Markov decision process / constrained Markov decision process

Classification

Social sciences

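Cite this article

Chen Zuxu, Chen Wei, Li Changkun, Han Yuxing. Sequential detection based quickest Markov decision processes: theory, algorithms, and applications [J]. Journal of Signal Processing, 2025, 41(3): 448-471.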

Funding

The National Natural Science Foundation of China (62261160390, 62471276)

The Shenzhen Startup Funding (QD2023014C)

Meituan Funding

Journal of Signal Processing (信号处理)

Open access (OA); Peking University Core Journal (北大核心)

ISSN 1003-0530
