Yalong Jiang

2024

VLAVAD: Vision-Language Models Assisted Unsupervised Video Anomaly Detection

1 January 2024 · 6374 words · 30 mins

ShanghaiTech · Unsupervised · Instruction Tuning · Hybrid Method

Proposes VLAVAD, an unsupervised video anomaly detection method built on pre-trained vision-language models. It combines semantic features, a Selective Prompt Adapter, and a Sequence State Space Module to improve interpretability and transferability, and reports state-of-the-art performance on the ShanghaiTech dataset.
