<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Yiming Wang on sis-arxiv-vad-papers</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/authors/yiming-wang/</link><description>Recent content in Yiming Wang on sis-arxiv-vad-papers</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Sun, 01 Oct 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://phuchoang2603.github.io/sis-arxiv-vad-papers/authors/yiming-wang/index.xml" rel="self" type="application/rss+xml"/><item><title>Harnessing Large Language Models for Training-free Video Anomaly Detection</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/zanella_harnessing_large_language_models_for_training-free_video_anomaly_detection_cvpr_2024_paper/</link><pubDate>Sun, 01 Oct 2023 00:00:00 +0000</pubDate><guid>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/zanella_harnessing_large_language_models_for_training-free_video_anomaly_detection_cvpr_2024_paper/</guid><description>Introduces a training-free method for video anomaly detection (VAD) leveraging pre-trained large language models (LLMs) and vision-language models (VLMs). 
Proposes techniques for caption cleaning, scene description, and anomaly scoring without additional training, demonstrating superior performance on surveillance datasets.</description></item><item><title>Delving into CLIP latent space for Video Anomaly Recognition</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/delving-into-clip-latent-space-for-video-anomaly-recognition/</link><pubDate>Sun, 01 Jan 2023 00:00:00 +0000</pubDate><guid>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/delving-into-clip-latent-space-for-video-anomaly-recognition/</guid><description>Proposes AnomalyCLIP, a method that combines Large Language and Vision (LLV) models such as CLIP with multiple instance learning and a re-centring transformation of the CLIP feature space to detect video anomalies and recognize their types. Introduces a Selector model with prompt learning and a Transformer-based Temporal model for capturing temporal dependencies; demonstrates state-of-the-art performance on multiple benchmarks.</description></item></channel></rss>