About
I am a master's student at the School of Computer Science, Fudan University, working on visual generation
and multimodal learning.
Previously, I obtained my bachelor's degree in Internet of Things Engineering from the Faculty of
Computing, Harbin Institute of Technology. My research focuses on computer vision, in particular
diffusion-based generation, audio-driven portrait animation, temporally coherent video restoration, and
human-centric video generation.
Visual Generation
Multimodal Learning
Portrait Animation
Video Restoration
Human-Centric Video
Selected Publications
ICML 2026 · Visual Generation
Prompt Reinjection: Alleviating Prompt Forgetting in Multimodal Diffusion Transformers
A training-free inference framework that mitigates semantic attenuation across deep layers in
multimodal diffusion transformers.
SIGGRAPH Asia 2025 · Portrait Animation
Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization
An audio-driven portrait animation model that uses direct preference optimization to produce temporally
coherent, realistic dynamic video portraits.
ICCV 2025 · Video Restoration
DicFace: Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration
A video face restoration method based on Dirichlet-constrained variational codebook mapping for
improved temporal smoothness and consistency.
CVPR 2025 · Dataset
OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation
A large-scale dataset designed to improve domain accuracy and structural tracking for high-fidelity
human-centric video synthesis.
Education
Present
Fudan University
Master's student, School of Computer Science.
B.Eng.
Harbin Institute of Technology
Internet of Things Engineering, Faculty of Computing.