Vision-Language Models for Multimedia Applications:

From Foundations to State-of-the-Art

ACM Multimedia Asia 2024 Tutorial

December 3 - 6, 2024

Auckland, New Zealand

Introduction

Vision-Language Models (VLMs) are revolutionizing the multimedia landscape by seamlessly integrating visual and textual data for a wide range of applications, such as image captioning, Visual Question Answering (VQA), and multimodal retrieval. This tutorial will explore both foundational and state-of-the-art VLMs, providing attendees with a deep understanding of how these models function and how they can be applied effectively.

Participants will explore the evolution of VLMs from classical architectures like CNNs and RNNs to cutting-edge transformer-based models such as CLIP and BLIP. The tutorial will also focus on key challenges such as scaling these models, optimizing their performance, and improving their interpretability for real-world multimedia applications.
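
As a small taste of the transformer-based models covered in the later sessions, the sketch below shows how CLIP scores an image against a set of candidate captions in a shared embedding space, the core idea behind zero-shot multimodal retrieval. It is a minimal illustration using the Hugging Face transformers library; the checkpoint name, image URL, and prompts are illustrative assumptions rather than part of the tutorial materials.

```python
# Minimal sketch: zero-shot image-text matching with CLIP
# (checkpoint, image URL, and captions are placeholder choices).
import requests
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

# Any RGB image works; this COCO validation image is just an example.
url = "http://images.cocodataset.org/val2017/000000039769.jpg"
image = Image.open(requests.get(url, stream=True).raw)

# Candidate captions; CLIP embeds image and text in a shared space
# and scores their similarity, enabling zero-shot retrieval.
texts = ["a photo of two cats", "a photo of a dog", "a city skyline at night"]

inputs = processor(text=texts, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    outputs = model(**inputs)

# logits_per_image holds the image-to-caption similarity scores.
probs = outputs.logits_per_image.softmax(dim=-1)
for caption, p in zip(texts, probs[0].tolist()):
    print(f"{p:.3f}  {caption}")
```

The same image-text similarity mechanism underlies the retrieval and captioning applications discussed in the tutorial.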


Speaker

Dr. Yanbin Liu

Lecturer (Assistant Professor), Auckland University of Technology

Dr. Yanbin Liu is an expert in deep learning and Vision-Language Models, with a focus on their application in multimedia systems. He has published over 30 high-impact research papers in top-tier venues, including CVPR, ICCV, ECCV, and ICLR, amassing over 1,400 citations. Dr. Liu’s research interests center on the integration of visual and textual data, AI-driven content generation, and multimedia retrieval. He has served as Area Chair for ACM Multimedia 2024 and AJCAI 2024, and is a two-time recipient of the CVPR Outstanding Reviewer Award (2021, 2024).


Schedule

Session | Time
Session 1: Introduction to Vision-Language Modeling | 09:00 AM - 09:45 AM
Session 2: Vision-Language Modeling Using Deep Learning | 09:45 AM - 10:30 AM
Morning Tea | 10:30 AM - 10:45 AM
Session 3: Recent Advances in Vision-Language Models | 10:45 AM - 11:30 AM
Session 4: Challenges and Future Directions | 11:30 AM - 12:15 PM