About Me
I am currently a Research Scientist at Adobe Research. My research interests include multimodal learning, natural language processing, and reinforcement learning. More specifically, I focus on the following subfields:
- Multimodal LLMs for text-rich image understanding and reasoning
- Multimodal generative models aligned with user preference
- Reducing training and dataset construction costs for multimodal models
Before that, I obtained my Ph.D. from the Department of Computer Science, Duke University. My thesis is about Uncertainty Estimation in Deep Reinforcement Learning, and my Ph.D. advisor is Professor Lawrence Carin. I received my B.Sc. degree from Nanjing University in 2016, where I was a member of the LAMDA Group, led by Dr. Zhi-Hua Zhou.
I worked as a research intern at AWS AI Lab (New York, Summer 2020), Google Brain (Mountain View, Summer 2019), Samsung Research America (Mountain View, Spring 2019 & 2020), Adobe Research (San Jose, Summer 2018), and Alibaba AntAI (Hangzhou, Summer 2016).
I am always happy to collaborate with enthusiastic and talented students on topics related to multimodal large language models and reinforcement learning. If you are interested in working with me, feel free to reach out. [previous interns/collaborators]
News
- Jan. 2025: One paper has been accepted to NAACL 2025.
- Jan. 2025: One paper has been accepted to ICLR 2025.
- Dec. 2024: One paper has been accepted to AAAI 2025.
- Nov. 2024: One paper has been accepted to COLING 2025.
- Sep. 2024: One paper has been accepted to EMNLP 2024.
- July 2024: Two papers have been accepted to COLM 2024.
- Mar. 2024: Two papers have been accepted to NAACL 2024.
- Feb. 2024: Two papers have been accepted to CVPR 2024.
- Jan. 2024: Three papers have been accepted to ICLR 2024.
- Dec. 2023: One paper has been accepted to AAAI 2024.
- Oct. 2023: One paper has been accepted to EMNLP 2023.
- Sep. 2023: Two papers have been accepted to NeurIPS 2023.
- July 2023: One paper has been accepted to ICCV 2023.
- Apr. 2023: Two papers have been accepted to ACL 2023.
Selected Publications [Full List]
- LoCAL: LoRA-Contextualizing Adaptation of Large Multimodal Models for Long Document Understanding [Paper]
Jian Chen, Ruiyi Zhang, Yufan Zhou, Tong Yu, Franck Dernoncourt, Jiuxiang Gu, Ryan A. Rossi, Changyou Chen, Tong Sun
International Conference on Learning Representations (ICLR), 2025.
- TRINS: Towards Multimodal Language Models that Can Read [Paper]
Ruiyi Zhang, Yanzhe Zhang, Jian Chen, Yufan Zhou, Jiuxiang Gu, Changyou Chen, Tong Sun
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
- AutoDAN: Interpretable Gradient-Based Adversarial Attacks on Large Language Models [Paper]
Sicheng Zhu, Ruiyi Zhang, Bang An, Gang Wu, Joe Barrow, Zichao Wang, Furong Huang, Ani Nenkova, Tong Sun
Conference on Language Modeling (COLM), 2024.
- Towards Aligned Layout Generation via Diffusion Model with Aesthetic Constraints [Paper]
Jian Chen, Ruiyi Zhang, Yufan Zhou, Rajiv Jain, Zhiqiang Xu, Ryan Rossi, Changyou Chen
International Conference on Learning Representations (ICLR), 2024.
- Automatic Pseudo-Harmful Prompt Generation for Evaluating False Refusals in Large Language Models [Paper]
Bang An, Sicheng Zhu, Ruiyi Zhang, Michael-Andrei Panaitescu-Liess, Yuancheng Xu, Furong Huang
Conference on Language Modeling (COLM), 2024.
- Customization Assistant for Text-to-image Generation [Paper]
Yufan Zhou, Ruiyi Zhang, Jiuxiang Gu, Tong Sun
Conference on Computer Vision and Pattern Recognition (CVPR), 2024.
- LLaVAR: Enhanced Visual Instruction Tuning for Text-rich Image Understanding [Paper]
Yanzhe Zhang, Ruiyi Zhang, Jiuxiang Gu, Yufan Zhou, Nedim Lipka, Diyi Yang, Tong Sun
Workshop on Instruction Tuning and Instruction Following, NeurIPS, 2023.
- LAFITE: Towards Language-Free Training for Text-to-Image Generation [Paper]
Yufan Zhou, Ruiyi Zhang, Changyou Chen, Chunyuan Li, Chris Tensmeyer, Tong Yu, Jiuxiang Gu, Jinhui Xu, Tong Sun
Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
- Text-based Interactive Recommendation via Constraint-Augmented Reinforcement Learning [Paper]
Ruiyi Zhang, Tong Yu, Yilin Shen, Hongxia Jin, Changyou Chen, Lawrence Carin
Neural Information Processing Systems (NeurIPS), 2019.
- GenDICE: Generalized Offline Estimation of Stationary Values (Oral) [Paper]
Ruiyi Zhang, Bo Dai, Lihong Li, Dale Schuurmans
International Conference on Learning Representations (ICLR), 2020.
- Policy Optimization as Wasserstein Gradient Flows [Paper] [Poster] [Appendix] [Demo]
Ruiyi Zhang, Changyou Chen, Chunyuan Li, Lawrence Carin
International Conference on Machine Learning (ICML), 2018.
Abridged in NIPS 2017, Workshop on Deep Reinforcement Learning.
Professional Services
- Senior Program Committee & Area Chair: IJCAI'21-22, AAAI'22
- Program Committee & Reviewer: AAAI'19-21, NeurIPS'19-21, ICLR'21, ACL'20-21, EMNLP'20-21, CVPR'19-21, ICCV'19-21, ICML'20-21
- Workshop on Continual and Multimodal Learning @ Ubicomp'19 and IJCAI'21
Awards & Honors
- Outstanding Graduate Student of Nanjing University, Rank 1st/143 in CS, 2016
- CCF (China Computer Federation) Outstanding Undergraduate Award, 2015
- Microsoft Young Fellowship by Microsoft Research Asia, 2015
- Pacemaker of Outstanding Student of Nanjing University (30/12000+), 2014
- National Undergraduate Scholarship for three consecutive years, 2013
Miscellaneous
Outside of research, I enjoy reading, traveling, and cooking. I post some of my photos on 500px.