Tommie Kerssies

2023

Simplifying Traffic Anomaly Detection with Video Foundation Models

The paper investigates simple encoder-only Video Vision Transformers (Video ViTs) with various pre-training strategies for traffic anomaly detection (TAD). It demonstrates that, with strong pre-training and domain adaptation, a minimal architecture can outperform more complex prior methods, which highlights the importance of pre-training strategies such as Masked Video Modeling (MVM).