树木倒伏场景中多模态大模型的应用挑战与优化研究

Journal of East China Normal University (Natural Science), 2025, Issue (5): 53-65, 13. DOI: 10.3969/j.issn.1000-5641.2025.05.006

Research on challenges and optimization of large multimodal model applications in treefall scenarios

冯雷 [1], 缪思好 [1], 李超楠 [1], 盛春杰 [2], 施宇星 [2], 黄奕铖 [1], 金剑虹 [1], 许韵 [1], 杜聿洲 [1], 周妮娜 [1]

Author information

  • 1. 杭州拓数派科技有限公司, Hangzhou 310000, China
  • 2. 平湖市政务服务管理办公室, Pinghu, Zhejiang 314200, China

Abstract

To address the limited robustness of large multimodal models (LMMs) in complex visual scenarios, such as identifying responsibility for fallen trees, which stems from their reliance on single-path reasoning, this study proposes a novel reasoning optimization method based on Beam Search Chain-of-Thought (BS-CoT). Conventional models often fall into a "first-impression" trap, in which an incorrect initial inference leads to an irreversible analytical failure. The proposed BS-CoT method counteracts this by exploring and evaluating multiple candidate inference paths in parallel. It maintains a diverse set of hypotheses about the scene and continuously prunes the less likely ones, which keeps the model from committing to a single, fallacious line of reasoning and significantly enhances visual decision-making in complex and noisy environments. To validate the method, we constructed a specialized dataset covering a wide range of treefall incidents in urban governance. Experimental results showed that the proposed method achieved substantial improvements in both event recall and key information capture rate compared with baseline models. This research provides a reliable technical solution for visual decision-making challenges in urban public safety and introduces a more robust paradigm for improving the reasoning reliability of large models in critical applications.
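The abstract does not give implementation details, so the following is only a minimal sketch of the general idea it describes: keep several reasoning chains alive in parallel and prune the lowest-scoring ones at each step. The function `beam_search_cot`, the `expand` callback, the toy step table, and all scores are hypothetical stand-ins for illustration; in a real system the candidate steps and their scores would presumably come from the multimodal model itself.

```python
from typing import Callable, List, Tuple

Path = Tuple[str, ...]  # one chain of reasoning steps


def beam_search_cot(
    expand: Callable[[Path], List[Tuple[str, float]]],
    beam_width: int,
    max_steps: int,
) -> List[Tuple[Path, float]]:
    """Keep the `beam_width` highest-scoring chains after every step.

    `expand(path)` is a hypothetical callback that returns candidate next
    steps with additive (log-probability-like) scores. An empty list means
    the chain is finished and is carried over unchanged.
    """
    beam: List[Tuple[Path, float]] = [((), 0.0)]
    for _ in range(max_steps):
        candidates: List[Tuple[Path, float]] = []
        for path, score in beam:
            next_steps = expand(path)
            if not next_steps:
                candidates.append((path, score))
                continue
            for step, step_score in next_steps:
                candidates.append((path + (step,), score + step_score))
        # Prune: sort by cumulative score and keep only the best chains.
        candidates.sort(key=lambda item: item[1], reverse=True)
        beam = candidates[:beam_width]
    return beam


# Toy scene (invented): the most likely first impression leads to a poor
# conclusion, while the second-ranked hypothesis leads to a good one.
TOY_STEPS = {
    (): [("tree is being pruned", -0.4), ("tree has fallen onto road", -0.6)],
    ("tree is being pruned",): [("no incident to report", -2.0)],
    ("tree has fallen onto road",): [("report treefall to road authority", -0.2)],
}


def toy_expand(path: Path) -> List[Tuple[str, float]]:
    return TOY_STEPS.get(path, [])


if __name__ == "__main__":
    for width in (1, 2):
        best_path, best_score = beam_search_cot(toy_expand, width, max_steps=3)[0]
        print(width, best_path, round(best_score, 2))
```

With `beam_width=1` the search behaves like single-path reasoning and stays on the "first-impression" chain; with `beam_width=2` the second hypothesis survives the first step and ends up with the higher cumulative score.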

Key words

large multimodal model/social governance/AI agent

Classification

Information technology and security science

Cite this article

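冯雷, 缪思好, 李超楠, 盛春杰, 施宇星, 黄奕铖, 金剑虹, 许韵, 杜聿洲, 周妮娜. Research on challenges and optimization of large multimodal model applications in treefall scenarios [J]. Journal of East China Normal University (Natural Science), 2025, (5): 53-65, 13.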

Journal of East China Normal University (Natural Science): open access; Peking University core journal (北大核心); ISSN 1000-5641
