Vad-R1: Towards Video Anomaly Reasoning via Perception-to-Cognition Chain-of-Thought
·11169 words·53 mins
Proposes a structured Perception-to-Cognition Chain-of-Thought and introduces the Vad-Reasoning dataset, along with AVA-GRPO, an improved reinforcement learning algorithm, to enhance the deep reasoning capabilities of Multimodal Large Language Models in video anomaly detection and understanding tasks.
