Instruction Tuning

2025

Personalizing Vision-Language Models With Hybrid Prompts for Zero-Shot Anomaly Detection

13 February 2025·8885 words·42 mins

Yunkang Cao, Xiaohao Xu, Yuqi Cheng, Chen Sun, Zongwei Du, Liang Gao, Weiming Shen

Cuhk-Avenue · Shanghaitech · Xd-Violence · Ubnormal · Ucf-Crime · Ucsd-Ped · Other · Weakly Supervised · Semi Supervised · Training Free · Instruction Tuning · Unsupervised · Hybrid · Other

Introduces AnomalyVLM, a framework that personalizes vision-language models for zero-shot anomaly detection. It incorporates an anomaly region generator and refiner, and uses hybrid prompts derived from prior knowledge for category-specific customization and improved detection performance.

PLOVAD: Prompting Vision-Language Models for Open Vocabulary Video Anomaly Detection

10 January 2025·10371 words·49 mins

Chenting Xu, Ke Xu, Xinghao Jiang, Tanfeng Sun

Ucf-Crime · Shanghaitech · Xd-Violence · Ubnormal · Weakly Supervised · Instruction Tuning · Unsupervised · Hybrid · Method

Introduces PLOVAD, a framework that applies prompt tuning to large-scale pretrained image-based vision-language models for open vocabulary video anomaly detection. It combines domain-specific and anomaly-specific prompts with a temporal module to detect and categorize both seen and unseen anomalies with a limited number of parameters.

2024

VLAVAD: Vision-Language Models Assisted Unsupervised Video Anomaly Detection

1 January 2024·6374 words·30 mins

Changkang Li, Yalong Jiang

Shanghaitech · Unsupervised · Instruction Tuning · Hybrid · Method

Proposes VLAVAD, an unsupervised video anomaly detection method built on pre-trained vision-language models. It uses semantic features, a Selective Prompt Adapter, and a Sequence State Space Module to improve interpretability and transferability, achieving state-of-the-art performance on the ShanghaiTech dataset.

2023

VADSK: Video Anomaly Detection with Structured Keywords

1 October 2023·6806 words·32 mins

Thomas Foltz

Ucsd-Ped · Shanghaitech · Cuhk-Avenue · Semi Supervised · Instruction Tuning · Method

Presents a lightweight, two-stage video anomaly detection pipeline that employs foundation models for frame description generation and keyword-based classification. It achieves performance comparable to state-of-the-art methods, with real-time inference and enhanced interpretability.

Action Hints: Semantic Typicality and Context Uniqueness for Generalizable Skeleton-based Video Anomaly Detection

1 October 2023·7886 words·38 mins

Canhui Tang, Sanping Zhou, Haoyue Shi, Le Wang

Shanghaitech · Ubnormal · Ucf-Crime · Semi Supervised · Instruction Tuning · Method

Proposes a zero-shot, skeleton-based video anomaly detection framework built on action semantic typicality and context uniqueness learning. It combines a language-guided typicality modeling module with a test-time context uniqueness analysis module, achieving state-of-the-art results without target-domain training data.

A Survey on Video Anomaly Detection via Deep Learning: Human, Vehicle, and Environment

1 October 2023·18514 words·87 mins

Ghazal Alinezhad Noghre, Armin Danesh Pazho, Hamed Tabkhi

Cuhk-Avenue · Shanghaitech · Xd-Violence · Ucf-Crime · Ucsd-Ped · Other · Semi Supervised · Unsupervised · Instruction Tuning · Hybrid · Survey

This survey provides a comprehensive overview of deep learning-based Video Anomaly Detection (VAD), covering challenges, methodologies, domain-specific applications, and future research directions across human-centric, vehicle-centric, and environment-centric contexts. It introduces a taxonomy of supervision levels and adaptive learning strategies, and explores diverse application areas including healthcare, public safety, road surveillance, and disaster detection, emphasizing the latest advancements and open challenges.
