Chen Sun

2023

Towards Generic Anomaly Detection and Understanding: Large-scale Visual-linguistic Model (GPT-4V) Takes the Lead

31 October 2023 · 12272 words · 58 mins

Yunkang Cao, Xiaohao Xu, Chen Sun, Xiaonan Huang, Weiming Shen

This study explores the use of GPT-4V, a large visual-linguistic model, for generic anomaly detection across multiple modalities and domains. It shows that GPT-4V can understand both global and fine-grained semantics, reason automatically, and improve when given additional prompts. The study evaluates GPT-4V on industrial, medical, logical, video, 3D, and time series anomaly detection tasks, reports promising performance, and discusses future directions for enhancement: quantitative metrics, expanded benchmarks, multi-round interactions, human feedback, and real-time application.

Chen Sun

2025

Personalizing Vision-Language Models With Hybrid Prompts for Zero-Shot Anomaly Detection
