Yiming Wang

2023

Harnessing Large Language Models for Training-free Video Anomaly Detection

1 October 2023·6913 words·33 mins

Luca Zanella , Willi Menapace , Massimiliano Mancini , Yiming Wang , Elisa Ricci

Introduces a training-free method for video anomaly detection (VAD) leveraging pre-trained large language models (LLMs) and vision-language models (VLMs). Proposes techniques for caption cleaning, scene description, and anomaly scoring without additional training, demonstrating superior performance on surveillance datasets.

Delving into CLIP latent space for Video Anomaly Recognition

1 January 2023·11434 words·54 mins

Luca Zanella , Benedetta Liberatori , Willi Menapace , Fabio Poiesi , Yiming Wang , Elisa Riccia

Shanghaitech Ucf-Crime Xd-Violence Semi Supervised Other

Proposes AnomalyCLIP, a novel method leveraging Large Language and Vision (LLV) models like CLIP, combined with multiple instance learning and a re-centring transformation of the CLIP feature space, to detect and classify video anomalies and recognize anomaly types. Introduces a Selector model with prompt learning and a Temporal Transformer-based model for temporal dependency modeling; demonstrates state-of-the-art performance on multiple benchmarks.

↑