About Me

Hi there! I'm a PhD student in the Department of Computer Science at Purdue University, advised by Dr. Ruqi Zhang. I obtained my B.S. degree at the School of Mathematics, Tianjin University. Previously, I worked as a research assistant in the MLDM Lab's Multimodal Vision Processing (MVP) Group, under the guidance of Dr. Bing Cao.

My research interests lie in developing reliable machine learning algorithms and frameworks for real-world applications, with a particular focus on the alignment of Large Foundation Models (LLMs and VLMs) and the generalization of multimodal learning algorithms.

Research Interests

  • Multimodal Learning

    Multimodal fusion and imbalanced multimodal learning.

  • Foundation Models

    Alignment, reasoning, and self-correction for LLMs and VLMs.

  • Trustworthy AI

    Safety, uncertainty, and reliability in real-world model behavior.

Open to collaborations! Feel free to reach out if our research interests align.

Latest News

May 2026

Two papers were accepted to ICML 2026.

Jan 2026

One paper was accepted at ICLR 2026.

Sep 2025

One paper was accepted at NeurIPS 2025.

Aug 2025

One paper was accepted at EMNLP 2025 Main Conference.

May 2025

Yi will give a talk about VLM safety at Shenlan School.

Apr 2025

Yi serves as a reviewer for NeurIPS 2025.

Jan 2025

Our paper, dataset, and models about VLM Multi-Image Safety (MIS) are released.

Jan 2025

Our paper about MLLM safety alignment was accepted at ICLR 2025.

Sep 2024

Yi serves as a reviewer for ICLR 2025.

Sep 2024

Our paper about dynamic image fusion without additional training was accepted at NeurIPS 2024.

Jul 2024

Yi will present a poster at ICML 2024, Hall C 4-9 #2817, Vienna, Austria.

May 2024

Our paper about multimodal fusion was accepted at ICML 2024.

Publications

* indicates equal contribution.
ICML 2026
Learning Self-Correction in Vision–Language Models via Rollout Augmentation

Yi Ding, Ziliang Qiu, Bolian Li, Ruqi Zhang

TL;DR: We propose Octopus, an RL rollout augmentation framework that synthesizes dense self-correction examples by recombining existing rollouts. Octopus-8B achieves SoTA performance by advancing reasoning and self-correction capabilities.

International Conference on Machine Learning (ICML), 2026

ICML 2026 Position Paper
Modular Safety Guardrails Are Necessary for Foundation-Model-Enabled Robots in the Real World

Joonkyung Kim, Wenxi Chen, Davood Soleymanzadeh, Yi Ding, Xiangbo Gao, Zhengzhong Tu, Ruqi Zhang, Fan Fei, Sushant Veer, Yiwei Lyu, Minghui Zheng, Yan Gu

TL;DR: We propose modular safety guardrails with monitoring and intervention layers, and show how cross-layer co-design enables faster, less conservative, and more effective safety for physical AI.

ICML 2026 Position Paper

Technical Report
SafeWork-R1: Coevolving Safety and Intelligence under the AI-45° Law

Shanghai Artificial Intelligence Laboratory, ..., Yi Ding, [and 100+ authors]

TL;DR: We introduce SafeWork-R1, a cutting-edge multimodal reasoning model that demonstrates the coevolution of capabilities and safety.

Technical Report, 2025

ICLR 2026
Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models

Yi Ding*, Lijun Li*, Bing Cao, Jing Shao

TL;DR: Introducing the first multi-image safety (MIS) dataset, which includes both training and test splits. VLMs fine-tuned with the MIRage method on the MIS training set improve in both safety and general performance.

International Conference on Learning Representations (ICLR), 2026

NeurIPS 2025
Sherlock: Self-Correcting Reasoning in Vision-Language Models

Yi Ding, Ruqi Zhang

TL;DR: We present Sherlock, a self-correction and self-improvement training framework enhancing VLM reasoning ability using minimal annotated data.

Neural Information Processing Systems (NeurIPS), 2025

EMNLP 2025
Visual Contextual Attack: Jailbreaking MLLMs with Image-Driven Context Injection

Ziqi Miao*, Yi Ding*, Lijun Li, Jing Shao

TL;DR: We present VisCo-Attack, which jailbreaks MLLMs via a visual-centric setting and fabricated visual context.

EMNLP 2025 Main Conference

ICLR 2025
ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time

Yi Ding, Bolian Li, Ruqi Zhang

TL;DR: Establishing a multimodal safety mechanism for VLMs that enhances the harmlessness and helpfulness of responses without additional training.

International Conference on Learning Representations (ICLR), 2025

NeurIPS 2024
Test-Time Dynamic Image Fusion

Bing Cao, Yinan Xia*, Yi Ding*, Changqing Zhang, Qinghua Hu

TL;DR: Improving the quality of fused images for almost every backbone without additional training by setting dynamic weights at test time.

Neural Information Processing Systems (NeurIPS), 2024

ICML 2024
Predictive Dynamic Fusion

Bing Cao, Yinan Xia*, Yi Ding*, Changqing Zhang, Qinghua Hu

TL;DR: The key to dynamic fusion lies in the correlation between the weights and the loss, providing a generalization theory for decision-level fusion.

International Conference on Machine Learning (ICML), 2024

Education

2025.08 - Present

Ph.D. at Computer Science, Purdue University

Advisor: Dr. Ruqi Zhang

2021.08 - 2025.06

B.S. at School of Mathematics, Tianjin University

Advisor: Dr. Bing Cao

Academic Services

Conference Reviewer
ICLR 2025, 2026; NeurIPS 2025, 2026; ICML 2026; ARR 2025