Guansong Pang on sis-arxiv-vad-papers

https://phuchoang2603.github.io/sis-arxiv-vad-papers/authors/guansong-pang/
Recent content in Guansong Pang on sis-arxiv-vad-papers
Last updated: Mon, 01 Jan 2024 00:00:00 +0000

VadCLIP: Adapting Vision-Language Models for Weakly Supervised Video Anomaly Detection
https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/vadclip-adapting-vision-language-models-for-weakly-supervised-video-anomaly-detection/
Mon, 01 Jan 2024 00:00:00 +0000
A novel paradigm for weakly supervised video anomaly detection that leverages a frozen CLIP model with a dual-branch architecture, temporal modeling modules, and prompt mechanisms to apply vision-language knowledge to both coarse- and fine-grained detection tasks, achieving state-of-the-art performance on benchmarks.

AssistPDA: An Online Video Surveillance Assistant for Video Anomaly Prediction, Detection, and Analysis
https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/assistpda-an-online-video-surveillance-assistant-for-video-anomaly-prediction-detection-and-analysis/
Sun, 01 Oct 2023 00:00:00 +0000
Introduces AssistPDA, a framework for real-time online video anomaly prediction, detection, and analysis that leverages vision-language models with a novel spatiotemporal relation distillation module, together with a newly constructed benchmark dataset, VAPDA-127K.

AVadCLIP: Audio-Visual Collaboration for Robust Video Anomaly Detection
https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/avadclip-audio-visual-collaboration-for-robust-video-anomaly-detection/
Sun, 01 Oct 2023 00:00:00 +0000
A novel weakly supervised framework that leverages audio-visual collaboration to improve the robustness and accuracy of video anomaly detection.

Open-Vocabulary Video Anomaly Detection
https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/wu_open-vocabulary_video_anomaly_detection_cvpr_2024_paper/
Sun, 01 Oct 2023 00:00:00 +0000
This paper explores open-vocabulary video anomaly detection (OVVAD), leveraging pre-trained large models to detect and categorize both seen and unseen anomalies. It proposes a disentangled approach with class-agnostic detection and class-specific classification modules, enhanced by semantic knowledge injection, anomaly synthesis, and joint optimization, to achieve state-of-the-art performance.

SlowFastVAD: Video Anomaly Detection via Integrating Simple Detector and RAG-Enhanced Vision-Language Model
https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/slowfastvad-video-anomaly-detection-via-integrating-simpledetector-and-rag-enhanced-vision-language-model/
Mon, 01 May 2023 00:00:00 +0000
Proposes a hybrid framework that integrates a fast anomaly detector with a slow, RAG-enhanced vision-language model to improve efficiency and interpretability in video anomaly detection. It employs a retrieval-augmented reasoning module for better scene-specific adaptation, uses an entropy-based intervention strategy to select ambiguous segments for analysis by the slow detector, and fuses the outputs of both detectors for robust detection.