Papers

2023

Anomaly-Led Prompting Learning Caption Generating Model and Benchmark

1 October 2023·12528 words·59 mins

Qianyue Bao, Fang Liu, Licheng Jiao, Yang Liu, Shuo Li, Lingling Li, Xu Liu, Xinyi Wang, Baoliang Chen

Introduces a new task for comprehensive video anomaly captioning, proposes a large-scale benchmark dataset CVACBench with fine-grained annotations, and designs a baseline model AGPFormer using prompt learning to improve anomaly understanding and description accuracy.

An Attribute-based Method for Video Anomaly Detection

1 October 2023·9752 words·46 mins

Tal Reiss, Yedid Hoshen

Shanghaitech Ucf-Crime Weakly Supervised Semi Supervised Method

A simple attribute-based approach that represents each object by velocity and pose attributes, combines these with deep representations, and uses density estimation for anomaly scoring, achieving state-of-the-art performance.

Aligning Effective Tokens with Video Anomaly in Large Language Models

1 October 2023·8317 words·40 mins

Yingxian Chen, Jiahui Liu, Ruidi Fan, Yanwei Li, Chirui Chang, Shizhen Zhao, Wilton W.T. Fok, Xiaojuan Qi, Yik-Chung Wu

Xd-Violence Hybrid Other

Proposes VA-GPT, a multimodal Large Language Model for video anomaly detection and understanding that uses effective token selection and generation modules (SETS and TETG) to improve spatial and temporal localization of anomalies. Introduces instruction-following fine-tuning data and cross-domain benchmarks for robustness evaluation.

Action Hints: Semantic Typicality and Context Uniqueness for Generalizable Skeleton-based Video Anomaly Detection

1 October 2023·7886 words·38 mins

Canhui Tang, Sanping Zhou, Haoyue Shi, Le Wang

Shanghaitech Ubnormal Ucf-Crime Semi Supervised Instruction Tuning Method

Proposes a zero-shot skeleton-based video anomaly detection framework utilizing action semantic typicality and context uniqueness learning, involving a language-guided typicality modeling module and a test-time context uniqueness analysis module, achieving state-of-the-art results without target domain training data.

Action Hints: Semantic Typicality and Context Uniqueness for Generalizable Skeleton-based Video Anomaly Detection

1 October 2023·7886 words·38 mins

Canhui Tang, Sanping Zhou, Haoyue Shi, Le Wang

Shanghaitech Ubnormal Ucf-Crime Hybrid Method

Proposes a zero-shot skeleton-based video anomaly detection framework leveraging action semantic typicality and context uniqueness learning, utilizing language-guided semantic modeling and test-time scene-adaptive boundaries to improve generalization without target domain training data.

A VLM-based Method for Visual Anomaly Detection in Robotic Scientific Laboratories

1 October 2023·4535 words·22 mins

Shiwei Lin, Chenxu Wang, Xiaozhen Ding, Yi Wang, Boyuan Du, Lei Song, Chenggang Wang, Huaping Liu

Other Semi Supervised Method

Proposes a vision-language reasoning approach utilizing hierarchical prompts and Chain-of-Thought inference for process anomaly detection in scientific experiments. Constructs a benchmark based on real chemical laboratory workflows and demonstrates improved accuracy with prompt granularity, validated through real-world robotic lab testing.

A Survey on Video Anomaly Detection via Deep Learning: Human, Vehicle, and Environment

1 October 2023·18514 words·87 mins

Ghazal Alinezhad Noghre, Armin Danesh Pazho, Hamed Tabkhi

Cuhk-Avenue Shanghaitech Xd-Violence Ucf-Crime Ucsd-Ped Other Semi Supervised Unsupervised Instruction Tuning Hybrid Survey

This survey provides a comprehensive overview of deep learning-based Video Anomaly Detection (VAD), covering challenges, methodologies, domain-specific applications, and future research directions across human-centric, vehicle-centric, and environment-centric contexts. It introduces a taxonomy of supervision levels and adaptive learning strategies, and explores application areas including healthcare, public safety, road surveillance, and disaster detection, emphasizing the latest advancements and open challenges.

Advanced Video Anomaly Detection Using Deep Learning

15 July 2023·8275 words·39 mins

Jane Doe, John Smith

Ucf-Crime Shanghaitech Semi Supervised Training Free Method

This paper introduces a novel deep learning framework for detecting anomalies in video content by leveraging semi-supervised approaches that require minimal labeled data, enhancing robustness and efficiency.

SlowFastVAD: Video Anomaly Detection via Integrating Simple Detector and RAG-Enhanced Vision-Language Model

1 May 2023·9715 words·46 mins

Zongcan Ding, Guansong Pang, Haodong Zhang, Zhiwei Yang, Yanning Zhang, Peng Wu, Peng Wang, Jing Liu, Fang Shen, Changkang Li

Ucsd-Ped Shanghaitech Xd-Violence Ubnormal Semi Supervised Hybrid Method

Proposes a hybrid framework that integrates a fast anomaly detector with a slow, RAG-enhanced vision-language model to improve efficiency and interpretability in video anomaly detection. It employs a retrieval-augmented reasoning module for better scene-specific adaptation, uses an entropy-based intervention strategy to select ambiguous segments for slow detector analysis, and fuses outputs for robust detection.

Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model

1 January 2023·10197 words·48 mins

Peng Wu, Jing Liu, Xiangteng He, Yuxin Peng, Peng Wang, Yanning Zhang

Ucf-Crime Shanghaitech Hybrid Other

Proposes a new task called Video Anomaly Retrieval (VAR), introduces two large-scale benchmarks (UCFCrime-AR and XDViolence-AR), and presents the Anomaly-Led Alignment Network (ALAN) for VAR, which retrieves long untrimmed videos using cross-modal queries such as language descriptions and synchronous audio. The work introduces anomaly-led sampling, a pretext task (VPMPM), and cross-modal alignment strategies to address the challenges of VAR in practical scenarios.

Text-Driven Traffic Anomaly Detection with Temporal High-Frequency Modeling in Driving Videos

1 January 2023·10710 words·51 mins

Rongqin Liang, Yuanman Li, Jiantao Zhou, Xia Li

Cuhk-Avenue Shanghaitech Hybrid Other

Introduces a novel single-stage approach (TTHF) for traffic anomaly detection that aligns video clips with text prompts and models high-frequency temporal changes, enhanced by an attention focusing mechanism, outperforming state-of-the-art methods on benchmark datasets.

TEVAD: Improved video anomaly detection with captions

1 January 2023·7563 words·36 mins

Weiling Chen, Keng Teck Ma, Zi Jian Yew, Minhoe Hur, David Aik-Aun Khoo

Shanghaitech Ucf-Crime Xd-Violence Ucsd-Ped Weakly Supervised Method

Proposes a framework that utilizes both visual and text features, generated through dense video captions, to enhance anomaly detection performance and explainability in videos.

Hierarchical Semantic Contrast for Scene-aware Video Anomaly Detection

1 January 2023·7920 words·38 mins

Shengyang Sun, Xiaojin Gong

Ucsd-Ped Shanghaitech Other Semi Supervised Other

The paper proposes a hierarchical semantic contrast (HSC) method that leverages scene-aware autoencoders, semantic contrastive learning, and motion augmentation for improved scene-dependent and scene-independent video anomaly detection. It incorporates pre-trained video parsing models, hierarchical contrastive learning at scene and object levels, and skeleton-based motion augmentation to make the normal feature representations more compact and discriminative, thereby enhancing anomaly detection performance.

Generating Anomalies for Video Anomaly Detection with Prompt-based Feature Mapping

1 January 2023·8035 words·38 mins

Zuhao Liu, Xiao-Ming Wu, Dian Zheng, Kun-Yu Lin, Wei-Shi Zheng

Shanghaitech Xd-Violence Ucf-Crime Hybrid Method

The paper proposes a prompt-based feature mapping framework (PFMF) to generate unseen anomalies with unbounded types and narrow the scene gap for video anomaly detection, outperforming state-of-the-art methods on multiple datasets.

Delving into CLIP latent space for Video Anomaly Recognition

1 January 2023·11434 words·54 mins

Luca Zanella, Benedetta Liberatori, Willi Menapace, Fabio Poiesi, Yiming Wang, Elisa Ricci

Shanghaitech Ucf-Crime Xd-Violence Semi Supervised Other

Proposes AnomalyCLIP, a method that combines Large Language and Vision (LLV) models such as CLIP with multiple instance learning and a re-centring transformation of the CLIP feature space to detect video anomalies and recognize their types. Introduces a Selector model with prompt learning and a Temporal Transformer-based model for temporal dependency modeling, and demonstrates state-of-the-art performance on multiple benchmarks.

2021

TransAnomaly: Video Anomaly Detection Using Video Vision Transformer

30 August 2021·6645 words·32 mins

Hongchun Yuan, Zhenyu Cai, Hui Zhou, Yue Wang, Xiangzhi Chen

Shanghaitech Ucf-Crime Other Semi Supervised Method

A prediction-based video anomaly detection approach that combines U-Net with a Video Vision Transformer (ViViT) modified for video prediction, capturing richer temporal and global context information and enabling anomaly localization.
