I am Zhiyuan Liang, a recent graduate of USTC.

I’m extraordinarily fortunate to have spent a fruitful year as an intern at the NUS HPC AI Lab, under the supervision of Prof. Yang You and advised by Dr. Kai Wang and Dr. Wangbo Zhao. Before that, I worked as an intern at UNC Chapel Hill under the supervision of Prof. Huaxiu Yao. Before that, I was fortunate to begin my research journey at the Lab of Data Science, supervised by Prof. Xiang Wang and Prof. Xiangnan He.

My research interests lie at the intersection of Large Language Models and Multimodal Understanding. I am also exploring how to reach higher levels of intelligence from the perspective of weight space learning, and how to build a unified learning paradigm across modalities. I’m actively seeking PhD opportunities.

🔥 News

  • 2025.09: 🥳 DnD and two other papers accepted to NeurIPS 2025! Thanks to all collaborators!
  • 2025.07: 🌟 Our new work, Drag-and-Drop LLMs, customizes LLMs in seconds without tuning! Check our paper and code!
  • 2025.06: 🎉 I received my bachelor’s degree from USTC!

📝 Selected Publications

NeurIPS 2025

Drag-and-Drop LLMs: Zero-Shot Prompt-to-Weights

Zhiyuan Liang†, Dongwen Tang, Yuhao Zhou, Xuanlei Zhao, Mingjia Shi, Wangbo Zhao, Zekai Li, Peihao Wang, Konstantin Schürholt, Damian Borth, Michael M. Bronstein, Yang You, Zhangyang Wang†, Kai Wang† († project lead)

We introduce Drag-and-Drop LLMs (DnD) 🥳, a prompt-conditioned parameter generator that enables training-free adaptation of large language models. It features:

  • Producing task-specific LoRA matrices from unlabeled task prompts.
  • Generating weights for novel tasks in seconds, achieving up to 12,000× lower overhead.
  • Outperforming the strongest training LoRAs by up to 30% on various zero-shot benchmarks.
[paper] [code] [abstract]
Preprint

Dynamic Vision Mamba

Mengxuan Wu*, Zekai Li*†, Zhiyuan Liang*, Moyang Li, Xuanlei Zhao, Samir Khaki, Zheng Zhu, Xiaojiang Peng, Konstantinos N. Plataniotis, Kai Wang‡, Wangbo Zhao‡, Yang You (* equal contribution, † project lead, ‡ corresponding author)

We introduce Dynamic Vision Mamba (DyVM) 🚀, a dynamic inference framework for Mamba-based vision models that significantly reduces computation while preserving performance. It features:

  • Token-level efficiency: Customized token pruning with sequence rearrangement to maintain consistency between training and inference.
  • Block-level adaptivity: Dynamic selection of SSM blocks per image, reducing redundancy based on input complexity.
  • Strong efficiency-accuracy trade-off: Achieves 35.2% FLOPs reduction with only 1.7% accuracy drop on Vim-S, and generalizes across architectures and vision tasks.
[paper] [code] [abstract]
NeurIPS 2025

REPA Works Until It Doesn’t: Early-Stopped, Holistic Alignment Supercharges Diffusion Training

Ziqiao Wang*, Wangbo Zhao*, Yuhao Zhou, Zekai Li, Zhiyuan Liang, Mingjia Shi, Xuanlei Zhao, Pengfei Zhou, Kaipeng Zhang†, Zhangyang Wang, Kai Wang†, Yang You (* equal contribution, † corresponding author)

Representation alignment (REPA), which matches Diffusion Transformer (DiT) hidden features to a self-supervised encoder (e.g., DINO), dramatically accelerates the early epochs of diffusion training but plateaus or even degrades performance later. We trace this failure to a capacity mismatch between the gradient directions of the representation-alignment and denoising tasks, and introduce HASTE (Holistic Alignment with Stage-wise Termination for Efficient training), a two-phase DiT training schedule that keeps the help and drops the hindrance. On ImageNet 256×256, HASTE achieves a 28× reduction in optimization steps. It also improves text-to-image DiTs on MS-COCO, making it a simple yet principled recipe for efficient diffusion training across tasks.

[paper] [code] [abstract]
ICLR 2025

CREAM: Consistency Regularized Self-Rewarding Language Models

Zhaoyang Wang, Weilei He, Zhiyuan Liang, Xuchao Zhang, Chetan Bansal, Ying Wei, Weitong Zhang, Huaxiu Yao

Consistency Regularized sElf-rewarding lAnguage Model (CREAM) is a self-rewarding framework that improves LLM alignment without human-labeled preference data. It addresses the key issue of reward bias in iterative self-training by:

  • Formulating a generalized iterative preference fine-tuning framework with explicit consistency regularization.
  • Leveraging reward stability across iterations to produce more reliable preference labels.
  • Achieving superior alignment performance and higher reward consistency, particularly for smaller LLMs (e.g., 7B) where standard self-rewarding yields diminishing returns.
[paper] [code] [abstract]

📖 Educations

  • 2021.09 - 2025.06, Bachelor’s Degree in Artificial Intelligence (Talent Class), University of Science and Technology of China.

💻 Internships

  • 2023.03 - 2024.06, University of Science and Technology of China, Undergraduate Research Intern. Mentors: Xiang Wang, Xiangnan He.
  • 2024.05 - 2024.10, University of North Carolina at Chapel Hill, Research Intern. Mentor: Huaxiu Yao.
  • 2024.08 - 2025.08, National University of Singapore, Research Intern. Mentor: Yang You. Advisors: Kai Wang, Wangbo Zhao.