About Me
Hi! I'm a final-year undergraduate student at the School of Mathematics, Tianjin University. Currently, I am a research intern at RZ-Lab, Purdue University, advised by Dr. Ruqi Zhang. I'm also a research assistant in the MLDM Lab Multimodal Vision Processing (MVP) Group, advised by Dr. Bing Cao and Prof. Qinghua Hu. I am interested in building reliable machine learning algorithms and frameworks for the real world, especially the alignment of Large Foundation Models (LLMs, VLMs) and the generalization of multimodal learning algorithms.
Research Interests
- Multimodal Learning: Multimodal Fusion, Imbalanced Multimodal Learning.
- Alignment of Foundation Models: LLMs, VLMs.
- Trustworthy AI: Safety, Uncertainty, etc.
I'm looking for Fall 2025 PhD opportunities! You can find my CV here. If you share my research interests, feel free to add me on WeChat.
News
- [Jan. 2025]: Our paper, dataset, and models on MLLM Multi-Image Safety (MIS) are now released!
- [Jan. 2025]: Our paper on MLLM safety alignment has been accepted at ICLR 2025. Congratulations to all collaborators!
- [Sep. 2024]: Yi is serving as a reviewer for ICLR 2025!
- [Sep. 2024]: Our paper on Dynamic Image Fusion without additional training has been accepted at NeurIPS 2024! Congratulations to all collaborators!
- [Jul. 2024]: Yi will give a poster presentation on Tue 23 Jul, 1:30 p.m. to 3 p.m., at ICML, Hall C 4-9 #2817, Vienna, Austria!
- [May 2024]: Our paper on Multimodal Fusion has been accepted at ICML 2024!
Publications & Preprints
* indicates equal contribution.

Rethinking Bottlenecks in Safety Fine-Tuning of Vision Language Models
Yi Ding*, Lijun Li*, Bing Cao, Jing Shao
TL;DR: Introducing the first multi-image safety (MIS) dataset, which includes both training and test splits. Fine-tuning VLMs with the MIRage method and the MIS training set improves both the safety and the general performance of the models.
Preprint, arXiv 2025

ETA: Evaluating Then Aligning Safety of Vision Language Models at Inference Time
Yi Ding, Bolian Li, Ruqi Zhang
TL;DR: Establishing a multimodal safety mechanism for VLMs and enhancing the harmlessness and helpfulness of responses without additional training.
International Conference on Learning Representations (ICLR), 2025


Predictive Dynamic Fusion
Bing Cao, Yinan Xia*, Yi Ding*, Changqing Zhang, Qinghua Hu
TL;DR: The key to dynamic fusion lies in the correlation between the weights and the loss, providing generalization theory for decision-level fusion.
International Conference on Machine Learning (ICML), 2024
Education
- 2024.05 - Present, Research Intern, Computer Science, Purdue University
- 2021.08 - Present, B.S., School of Mathematics, Tianjin University