Peiying Yu

2025

Sherlock: Towards Multi-scene Video Abnormal Event Extraction and Localization via a Global-local Spatial-sensitive LLM

Proposes a new task (M-VAE) for the structured extraction and localization of abnormal events in videos, introduces the Sherlock model with a Global-local Spatial-sensitive MoE module and a Spatial Imbalance Regulator, and demonstrates its effectiveness through extensive experiments.