<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom"><channel><title>Jialong Zuo on sis-arxiv-vad-papers</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/authors/jialong-zuo/</link><description>Recent content in Jialong Zuo on sis-arxiv-vad-papers</description><generator>Hugo -- gohugo.io</generator><language>en</language><lastBuildDate>Sun, 01 Oct 2023 00:00:00 +0000</lastBuildDate><atom:link href="https://phuchoang2603.github.io/sis-arxiv-vad-papers/authors/jialong-zuo/index.xml" rel="self" type="application/rss+xml"/><item><title>Holmes-VAD: Towards Unbiased and Explainable Video Anomaly Detection via Multi-modal LLM</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/holmes-vad-towards-unbiased-and-explainable-video-anomaly-detection-via-multi-modal-llm/</link><pubDate>Sun, 01 Oct 2023 00:00:00 +0000</pubDate><guid>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/holmes-vad-towards-unbiased-and-explainable-video-anomaly-detection-via-multi-modal-llm/</guid><description>A novel framework leveraging multimodal instructions and large-scale datasets to enable unbiased, interpretable, and accurate video anomaly detection with large language models, including a new dataset VAD-Instruct50k with single-frame annotations and explanatory instruction data.</description></item><item><title>Holmes-VAU: Towards Long-term Video Anomaly Understanding at Any Granularity</title><link>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/holmes-vau-towards-long-term-video-anomaly-understanding-at-any-granularity/</link><pubDate>Sun, 01 Oct 2023 00:00:00 +0000</pubDate><guid>https://phuchoang2603.github.io/sis-arxiv-vad-papers/papers/holmes-vau-towards-long-term-video-anomaly-understanding-at-any-granularity/</guid><description>A semi-automated hierarchical video annotation framework combined with a novel Anomaly-focused Temporal Sampler and a multimodal large language model, aimed at comprehensive understanding of complex and long-term video anomalies across multiple temporal scales.</description></item></channel></rss>