SlowFastVAD: Video Anomaly Detection via Integrating Simple Detector and RAG-Enhanced Vision-Language Model
Zongcan Ding, Guansong Pang, Haodong Zhang, Zhiwei Yang, Yanning Zhang, Peng Wu, Peng Wang, Jing Liu, Fang Shen, Changkang Li
Proposes a hybrid framework that integrates a fast anomaly detector with a slow, RAG-enhanced vision-language model to improve efficiency and interpretability in video anomaly detection. It employs a retrieval-augmented reasoning module for better scene-specific adaptation, uses an entropy-based intervention strategy to select ambiguous segments for analysis by the slow detector, and fuses the fast and slow detectors' outputs for robust detection.
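
The summary does not give the exact formulas, so the following is only a minimal sketch of how an entropy-based intervention and score fusion could look. It assumes each segment's fast-detector output is a probability in [0, 1], measures ambiguity with binary Shannon entropy, and blends scores with a fixed weight. The names `binary_entropy`, `select_ambiguous`, `fuse`, the threshold of 0.9 bits, and the weight of 0.5 are all hypothetical and not taken from the paper; the slow detector's verdict is replaced by a placeholder value.

```python
import math


def binary_entropy(p: float) -> float:
    """Shannon entropy in bits of a Bernoulli(p) anomaly score."""
    if p <= 0.0 or p >= 1.0:
        return 0.0  # a fully confident score carries no uncertainty
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def select_ambiguous(fast_scores, threshold=0.9):
    """Return indices of segments whose fast score is too uncertain.

    The threshold (in bits) is an assumed hyperparameter.
    """
    return [i for i, p in enumerate(fast_scores)
            if binary_entropy(p) >= threshold]


def fuse(fast_scores, slow_scores, weight=0.5):
    """Blend slow scores into fast scores only where the slow detector ran.

    slow_scores maps segment index -> slow-detector score.
    A fixed convex combination is assumed here for illustration.
    """
    fused = list(fast_scores)
    for i, s in slow_scores.items():
        fused[i] = (1 - weight) * fast_scores[i] + weight * s
    return fused


# Toy fast-detector scores for four video segments.
fast = [0.02, 0.48, 0.95, 0.60]

# Only segments near 0.5 are uncertain enough to warrant the slow path.
ambiguous = select_ambiguous(fast)

# Placeholder for the slow detector's verdict on those segments.
slow = {i: 1.0 for i in ambiguous}

fused = fuse(fast, slow)
print(ambiguous, fused)
```

In this sketch the confident segments (scores 0.02 and 0.95) never reach the slow path, which is where the efficiency gain of such a scheme would come from; only the segments with near-chance scores are re-scored and fused.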
