arXiv cs.AI Weekly Digest (2026-03-15)
📚 624 papers 📅 Coverage: March 9 – March 15, 2026 🏷️ Category: cs.AI

📊 Research Topic Trends

🧠 Reasoning (154 papers)

Reasoning research focuses on chain-of-thought, multi-step reasoning, logical deduction, and other improvements to cognitive capabilities.

  • VLM-SubtleBench: How Far Are VLMs from Human-Level Subtle Comparative Reasoning?
    Minkyu Kim, Sangheon Lee, Dongmin Park
  • Slumbering to Precision: Enhancing Artificial Neural Network Calibration Through Sleep-like Processes
    Jean Erik Delanois, Aditya Ahuja, Giri P. Krishnan et al.
  • Is continuous CoT better suited for multi-lingual reasoning?
    Ali Hamza Bashir, Behzad Shomali, Markus Frey et al.
  • DARC: Disagreement-Aware Alignment via Risk-Constrained Decoding
    Mingxi Zou, Jiaxiang Chen, Junfan Li et al.
  • R2F: Repurposing Ray Frontiers for LLM-free Object Navigation
    Francesco Argenziano, John Mark Alexis Marcelo, Michele Brienza et al.
🤖 LLM Agents (144 papers)

LLM-agent research remains highly active, covering multi-agent collaboration, tool use, and autonomous decision-making.

🛡️ Safety & Alignment (134 papers)

Safety and alignment research draws wide attention, spanning model safety, privacy protection, and defense against adversarial attacks.

📚 RAG & Memory (116 papers)

Research on retrieval-augmented generation and memory mechanisms continues to advance, focusing on dynamic indexing, structured storage, and continual learning.

⚡ Efficiency & Optimization (116 papers)

Efficiency research keeps gaining momentum, focusing on quantization, pruning, inference acceleration, and other model compression and deployment techniques.

  • DARC: Disagreement-Aware Alignment via Risk-Constrained Decoding
    Mingxi Zou, Jiaxiang Chen, Junfan Li et al.
  • R2F: Repurposing Ray Frontiers for LLM-free Object Navigation
    Francesco Argenziano, John Mark Alexis Marcelo, Michele Brienza et al.
  • Geometrically Constrained Outlier Synthesis
    Daniil Karzanov, Marcin Detyniecki
  • First Estimation of Model Parameters for Neutrino-Induced Nucleon Knockout Using Simulation-Based Inference
    Karla Tame-Narvaez, Steven Gardiner, Aleksandra Ćiprijanović et al.
  • Improving through Interaction: Searching Behavioral Representation Spaces with CMA-ES-IG
    Nathaniel Dennler, Zhonghao Shi, Yiran Tao et al.
📈 Time Series (96 papers)

Time-series research covers spatiotemporal data modeling, multivariate forecasting, anomaly detection, and other applications.

🎮 Reinforcement Learning (93 papers)

Reinforcement-learning research runs hot, covering policy optimization, multi-task learning, safe RL, and other frontier directions.

📁 Other (86 papers)

This direction continues to develop, producing several high-quality papers.

🦾 Robotics & Embodied AI (77 papers)

Robotics and embodied-intelligence research covers navigation, manipulation, human-robot interaction, and other key applications.

  • Long-Short Term Agents for Pure-Vision Bronchoscopy Robotic Autonomy
    Junyang Wu, Mingyi Luo, Fangfang Xie et al.
  • CMMR-VLN: Vision-and-Language Navigation via Continual Multimodal Memory Retrieval
    Haozhou Li, Xiangyu Dong, Huiyan Jiang et al.
  • R2F: Repurposing Ray Frontiers for LLM-free Object Navigation
    Francesco Argenziano, John Mark Alexis Marcelo, Michele Brienza et al.
  • IronEngine: Towards General AI Assistant
    Xi Mo
  • Graph-Instructed Neural Networks for parametric problems with varying boundary conditions
    Francesco Della Santa, Sandra Pieraccini, Maria Strazzullo
👁️ Vision-Language Models (69 papers)

Vision-language model research remains active, focusing on multimodal alignment, cross-modal reasoning, and related problems.

🔥 Highlight Papers of the Week

    Long-Short Term Agents for Pure-Vision Bronchoscopy Robotic Autonomy

    Junyang Wu, Mingyi Luo, Fangfang Xie, Minghui Zhang, Hanxiao Zhang et al.
    Accurate intraoperative navigation is essential for robot-assisted endoluminal intervention, but remains difficult because of limited endoscopic field of view and dynamic artifacts. Existing navigation platforms often rely on external localization technologies, such as electromagnetic tracking or shape sensing, which increase hardware complexity and remain vulnerable to intraoperative anatomical mismatch. We present a vision-only autonomy framework that performs long-horizon bronchoscopic naviga...
    LLM Agents

    ImageEdit-R1: Boosting Multi-Agent Image Editing via Reinforcement Learning

    Yiran Zhao, Yaoqi Ye, Xiang Liu, Michael Qizhe Shieh, Trung Bui
    With the rapid advancement of commercial multi-modal models, image editing has garnered significant attention due to its widespread applicability in daily life. Despite impressive progress, existing image editing systems, particularly closed-source or proprietary models, often struggle with complex, indirect, or multi-step user instructions. These limitations hinder their ability to perform nuanced, context-aware edits that align with human intent. In this work, we propose ImageEdit-R1, a multi-...
    LLM Agents

    PIRA-Bench: A Transition from Reactive GUI Agents to GUI-based Proactive Intent Recommendation Agents

    Yuxiang Chai, Shunye Tang, Han Xiao, Rui Liu, Hongsheng Li
    Current Graphical User Interface (GUI) agents operate primarily under a reactive paradigm: a user must provide an explicit instruction for the agent to execute a task. However, an intelligent AI assistant should be proactive, which is capable of anticipating user intentions directly from continuous visual inputs, such as mobile or desktop screenshots, and offering timely recommendations without explicit user prompting. Transitioning to this proactive paradigm presents significant challenges. Rea...
    LLM Agents

    DARC: Disagreement-Aware Alignment via Risk-Constrained Decoding

    Mingxi Zou, Jiaxiang Chen, Junfan Li, Langzhang Liang, Qifan Wang et al.
    Preference-based alignment methods (e.g., RLHF, DPO) typically optimize a single scalar objective, implicitly averaging over heterogeneous human preferences. In practice, systematic annotator and user-group disagreement makes mean-reward maximization brittle and susceptible to proxy over-optimization. We propose **Disagreement-Aware Alignment via Risk-Constrained Decoding (DARC)**, a retraining-free inference-time method that frames response selection as distributionally robust, risk-sensitive d...
    Reinforcement Learning
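The DARC abstract frames response selection as worst-case (max-min) optimization over disagreeing annotator groups rather than mean-reward maximization. A minimal sketch of that idea, not the paper's actual algorithm; candidate names and group scores below are hypothetical:

```python
def select_worst_case(candidates, group_rewards):
    """Pick the candidate whose minimum reward across annotator
    groups is highest (max-min, i.e. distributionally robust)."""
    best, best_score = None, float("-inf")
    for cand in candidates:
        worst = min(group_rewards[cand])  # the least-satisfied group
        if worst > best_score:
            best, best_score = cand, worst
    return best

# Hypothetical scores from three annotator groups per response
rewards = {
    "reply_a": [0.9, 0.2, 0.8],  # highest mean, but one group objects
    "reply_b": [0.6, 0.5, 0.6],  # lower mean, acceptable to every group
}
print(select_worst_case(["reply_a", "reply_b"], rewards))  # reply_b
```

Note how mean-reward maximization would pick `reply_a` (mean 0.63 vs 0.57), while the robust criterion prefers the response no group strongly dislikes.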

    A Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation

    Cong Cao, Jingyao Zhang, Kun Tong
We propose a Hierarchical Error-Corrective Graph Framework for Autonomous Agents with LLM-Based Action Generation (HECG), which incorporates three core innovations: (1) Multi-Dimensional Transferable Strategy (MDTS): by integrating task quality metrics (Q), confidence/cost metrics (C), reward metrics (R), and LLM-based semantic reasoning scores (LLM-Score), MDTS achieves multi-dimensional alignment between quantitative performance and semantic context, enabling more precise selection of high-quality candi...
    Reinforcement Learning
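The MDTS component described above aggregates four dimensions (Q, C, R, LLM-Score) to rank candidate strategies. A sketch of one plausible aggregation, a weighted sum; the weights, field names, and example candidates are illustrative, and the paper's actual scheme may differ:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    q: float          # task quality metric (Q)
    c: float          # confidence/cost metric (C)
    r: float          # reward metric (R)
    llm_score: float  # LLM-based semantic reasoning score

def mdts_rank(cands, w=(0.3, 0.2, 0.3, 0.2)):
    """Rank candidates by a weighted sum of the four MDTS
    dimensions (illustrative weights)."""
    score = lambda x: w[0]*x.q + w[1]*x.c + w[2]*x.r + w[3]*x.llm_score
    return sorted(cands, key=score, reverse=True)

# Hypothetical candidate actions for an agent step
a = Candidate("retry_tool", q=0.8, c=0.6, r=0.7, llm_score=0.9)
b = Candidate("replan",     q=0.5, c=0.9, r=0.4, llm_score=0.3)
print(mdts_rank([a, b])[0].name)  # retry_tool
```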

    RubiCap: Rubric-Guided Reinforcement Learning for Dense Image Captioning

    Tzu-Heng Huang, Sirajul Salekin, Javier Movellan, Frederic Sala, Manjot Bilkhu
    Dense image captioning is critical for cross-modal alignment in vision-language pretraining and text-to-image generation, but scaling expert-quality annotations is prohibitively expensive. While synthetic captioning via strong vision-language models (VLMs) is a practical alternative, supervised distillation often yields limited output diversity and weak generalization. Reinforcement learning (RL) could overcome these limitations, but its successes have so far been concentrated in verifiable doma...
    Reinforcement Learning
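RubiCap's premise is that a rubric can turn open-ended captioning into a checkable reward for RL. As a toy illustration of the general idea (not the paper's reward model), a reward could be the fraction of rubric items a caption covers; the keyword-matching proxy and example rubric here are hypothetical:

```python
def rubric_reward(caption, rubric):
    """Score a caption as the fraction of rubric items it mentions
    (crude keyword proxy; real rubric grading would use a judge model)."""
    text = caption.lower()
    hits = sum(1 for item in rubric if item.lower() in text)
    return hits / len(rubric)

rubric = ["dog", "red frisbee", "park"]
print(rubric_reward("A dog catches a red frisbee in the park", rubric))  # 1.0
print(rubric_reward("A dog runs", rubric))  # ~0.333
```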

    Alignment-Aware and Reliability-Gated Multimodal Fusion for Unmanned Aerial Vehicle Detection Across Heterogeneous Thermal-Visual Sensors

    Ishrat Jahan, Molla E Majid, M Murugappan, Muhammad E. H. Chowdhury, N. B. Prakash et al.
Reliable unmanned aerial vehicle (UAV) detection is critical for autonomous airspace monitoring but remains challenging when integrating sensor streams that differ substantially in resolution, perspective, and field of view. Conventional fusion methods, such as wavelet-, Laplacian-, and decision-level approaches, often fail to preserve spatial correspondence across modalities and suffer from annotation inconsistencies, limiting their robustness in real-world settings. This study introduces two ...
    Safety & Alignment
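"Reliability-gated" fusion, as the title suggests, weights each modality's evidence by how trustworthy that sensor currently is. A minimal sketch of the concept for two confidence scores; the function and its reliability inputs are assumptions for illustration, not the paper's model:

```python
def gated_fusion(visual_conf, thermal_conf, visual_rel, thermal_rel):
    """Fuse per-modality detection confidences, weighting each by an
    estimated reliability gate in [0, 1] (illustrative sketch)."""
    den = visual_rel + thermal_rel
    if den == 0:
        return 0.0  # neither sensor is trusted
    return (visual_rel * visual_conf + thermal_rel * thermal_conf) / den

# If the thermal channel is judged unreliable, its low confidence
# is gated out and the visual score dominates.
print(gated_fusion(0.9, 0.4, visual_rel=1.0, thermal_rel=0.0))  # 0.9
print(gated_fusion(0.9, 0.4, visual_rel=0.5, thermal_rel=0.5))  # 0.65
```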

    UNBOX: Unveiling Black-box visual models with Natural-language

    Simone Carnemolla, Chiara Russo, Simone Palazzo, Quentin Bouniot, Daniela Giordano et al.
    Ensuring trustworthiness in open-world visual recognition requires models that are interpretable, fair, and robust to distribution shifts. Yet modern vision systems are increasingly deployed as proprietary black-box APIs, exposing only output probabilities and hiding architecture, parameters, gradients, and training data. This opacity prevents meaningful auditing, bias detection, and failure analysis. Existing explanation methods assume white- or gray-box access or knowledge of the training dist...
    Safety & Alignment

👥 Author Collaboration Analysis

The most prolific authors among this week's cs.AI papers, and their collaborations, are shown below. Node size indicates paper count; edge thickness indicates collaboration frequency.

[Figure: author collaboration network. Nodes include Hao Wang, Arman Cohan, Lei Zhang, Mohamad Alkadamani, Halim Yanikomeroglu, and Fan Yang. Legend: 5+ papers / 4 papers / 3 papers.]

🏆 Most Prolific Authors

Each of the authors below published multiple cs.AI papers this week, spanning several frontier research directions.

  • Hao Wang (5 papers)
  • Arman Cohan (4 papers)
  • Lei Zhang (4 papers)
  • Mohamad Alkadamani (4 papers)
  • Halim Yanikomeroglu (4 papers)
  • Fan Yang (4 papers)
  • Mikhail Pautov (4 papers)
  • Yiran Zhao (3 papers)

📈 Research Trend Insights

🤖 Agent safety research heats up

Security of multi-agent systems has become a research hotspot, with several notable works on MCP-server risk assessment and defense against attacks on agents.

🧠 Continual learning and memory mechanisms

Demand-paging techniques for LLM context windows and memory mechanisms for continual learning offer new solutions for long-running agents.

🦾 Embodied-AI safety evaluation

The arrival of benchmarks such as LABSHIELD and HomeSafe-Bench marks a move toward standardized safety evaluation for embodied AI.

⚡ Efficiency optimization reaches the architecture level

Work on Expert Threshold Routing and compiler optimizations for state-space models is driving architecture-level innovation in large-model efficiency.

🏥 Multimodal medical AI

Multimodal medical agent systems such as Meissa demonstrate the practical potential of lightweight medical AI.

🎨 New paradigms for generative models

ConvNeXt diffusion models and video-to-music generation point to new directions in generative-model architecture.

📑 Category Statistics Overview

[Chart: paper counts by research direction. Reasoning 154, LLM Agents 144, Safety & Alignment 134, RAG & Memory 116, Efficiency & Optimization 116, Time Series 96, Reinforcement Learning 93, Other 86, Robotics & Embodied AI 77, Vision-Language Models 69.]

Data source: arXiv cs.AI category | Generated: March 15, 2026

This report was generated by an automated tool and is for reference only.