Text Prompt with Normality Guidance for Weakly Supervised Video Anomaly Detection
Proposes a pseudo-label generation and self-training framework for weakly supervised video anomaly detection. The framework uses CLIP for text-image alignment and combines learnable text prompts, normality visual prompts, a pseudo-label generation module guided by normality clues, and a self-adaptive temporal dependence learning module. It reports state-of-the-art performance on benchmark datasets.
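
As a rough illustration of how text-image alignment can yield frame-level pseudo-labels, the sketch below scores each frame feature against a "normal" and an "anomalous" text embedding by cosine similarity and thresholds the resulting softmax probability. It is not the paper's implementation: the function names, the two-prompt setup, the temperature of 0.07, and the threshold of 0.5 are all assumptions made for illustration, and the learnable prompts, normality visual prompts, and temporal module are omitted.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def pseudo_labels(frame_feats, normal_text, anomaly_text,
                  temperature=0.07, threshold=0.5):
    """Hypothetical helper: per-frame anomaly probability and 0/1 pseudo-label.

    frame_feats  -- list of frame embeddings (assumed to come from an image encoder)
    normal_text  -- embedding of a "normal" text prompt (assumed)
    anomaly_text -- embedding of an "anomalous" text prompt (assumed)
    """
    scores, labels = [], []
    for feat in frame_feats:
        # Temperature-scaled similarities to the two text embeddings.
        s_normal = cosine(feat, normal_text) / temperature
        s_anomaly = cosine(feat, anomaly_text) / temperature
        # Numerically stable two-way softmax; keep the anomaly probability.
        m = max(s_normal, s_anomaly)
        e_normal = math.exp(s_normal - m)
        e_anomaly = math.exp(s_anomaly - m)
        p_anomaly = e_anomaly / (e_normal + e_anomaly)
        scores.append(p_anomaly)
        labels.append(1 if p_anomaly > threshold else 0)
    return scores, labels


# Toy 2-D embeddings: the first frame aligns with "normal", the second with "anomalous".
scores, labels = pseudo_labels(
    frame_feats=[[1.0, 0.0], [0.0, 1.0]],
    normal_text=[1.0, 0.0],
    anomaly_text=[0.0, 1.0],
)
```

In a self-training setup, labels of this kind would then supervise a frame-level classifier; how the normality clues refine them is specific to the paper and is not reproduced here.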
