<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Peng Wang on sis-arxiv-vad-papers</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/authors/peng-wang/</link><description>Recent content in Peng Wang on sis-arxiv-vad-papers</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Mon, 01 Jan 2024 00:00:00 +0000</lastBuildDate><atom:link href="https://phuchoang2603.github.io/sis-arxiv-vad-papers/authors/peng-wang/index.xml" rel="self" type="application/rss+xml"/><item><title>VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/vadclip-adapting-vision-language-models-for-weakly-supervised-video-anomaly-detection/</link><pubDate>Mon, 01 Jan 2024 00:00:00 +0000</pubDate><guid>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/vadclip-adapting-vision-language-models-for-weakly-supervised-video-anomaly-detection/</guid><description>A novel paradigm for weakly supervised video anomaly detection that leverages a frozen CLIP model with a dual-branch architecture, temporal modeling modules, and prompt mechanisms to apply vision-language knowledge to both coarse- and fine-grained detection tasks, achieving state-of-the-art performance on benchmarks.</description></item><item><title>AVadCLIP: Audio-Visual Collaboration for Robust Video Anomaly Detection</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/avadclip-audio-visual-collaboration-for-robust-video-anomaly-detection/</link><pubDate>Sun, 01 Oct 2023 00:00:00 +0000</pubDate><guid>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/avadclip-audio-visual-collaboration-for-robust-video-anomaly-detection/</guid><description>A novel weakly supervised framework leveraging audio-visual collaboration to improve the robustness and accuracy of video anomaly 
detection.</description></item><item><title>Open-Vocabulary Video Anomaly Detection</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/wu_open-vocabulary_video_anomaly_detection_cvpr_2024_paper/</link><pubDate>Sun, 01 Oct 2023 00:00:00 +0000</pubDate><guid>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/wu_open-vocabulary_video_anomaly_detection_cvpr_2024_paper/</guid><description>This paper explores open-vocabulary video anomaly detection (OVVAD) leveraging pre-trained large models to detect and categorize seen and unseen anomalies. It proposes a disentangled approach with class-agnostic detection and class-specific classification modules, enhanced by semantic knowledge injection, anomaly synthesis, and joint optimization, to achieve state-of-the-art performance.</description></item><item><title>SlowFastVAD: Video Anomaly Detection via Integrating Simple Detector and RAG-Enhanced Vision-Language Model</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/slowfastvad-video-anomaly-detection-via-integrating-simpledetector-and-rag-enhanced-vision-language-model/</link><pubDate>Mon, 01 May 2023 00:00:00 +0000</pubDate><guid>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/slowfastvad-video-anomaly-detection-via-integrating-simpledetector-and-rag-enhanced-vision-language-model/</guid><description>Proposes a hybrid framework that integrates a fast anomaly detector with a slow, RAG-enhanced vision-language model to improve efficiency and interpretability in video anomaly detection. 
It employs a retrieval-augmented reasoning module for better scene-specific adaptation, uses an entropy-based intervention strategy to select ambiguous segments for slow detector analysis, and fuses outputs for robust detection.</description></item><item><title>Towards Video Anomaly Retrieval from Video Anomaly Detection: New Benchmarks and Model</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/toward-video-anomaly-retrieval-from-video/</link><pubDate>Sun, 01 Jan 2023 00:00:00 +0000</pubDate><guid>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/toward-video-anomaly-retrieval-from-video/</guid><description>Proposes a new task called Video Anomaly Retrieval (VAR), introduces two large-scale benchmarks (UCFCrime-AR and XDViolence-AR), and presents a model called Anomaly-Led Alignment Network (ALAN) for VAR, which retrieves long untrimmed videos using cross-modal queries such as language descriptions and synchronous audio. The work introduces anomaly-led sampling, a pretext task (VPMPM), and cross-modal alignment strategies to address the challenges of VAR in practical scenarios.</description></item></channel></rss>