About

I am a master's student at the School of Computer Science, Fudan University, working on visual generation and multimodal learning.

Previously, I obtained my bachelor's degree in Internet of Things Engineering from the Faculty of Computing, Harbin Institute of Technology. My research is in computer vision, with a focus on diffusion-based generation, audio-driven portrait animation, temporally coherent video restoration, and human-centric video generation.

Visual Generation · Multimodal Learning · Portrait Animation · Video Restoration · Human-Centric Video

Selected Publications

ICML 2026 · Visual Generation

Prompt Reinjection: Alleviating Prompt Forgetting in Multimodal Diffusion Transformers

A training-free inference framework that mitigates semantic attenuation across deep layers in multimodal diffusion transformers.

SIGGRAPH Asia 2025 · Portrait Animation

Hallo4: High-Fidelity Dynamic Portrait Animation via Direct Preference Optimization

An audio-driven portrait animation model using direct preference optimization for temporally coherent and realistic dynamic video portraits.

ICCV 2025 · Video Restoration

DicFace: Dirichlet-Constrained Variational Codebook Learning for Temporally Coherent Video Face Restoration

A video face restoration method based on Dirichlet-constrained variational codebook mapping for improved temporal smoothness and consistency.

CVPR 2025 · Dataset

OpenHumanVid: A Large-Scale High-Quality Dataset for Enhancing Human-Centric Video Generation

A large-scale dataset designed to improve domain accuracy and structural tracking for high-fidelity human-centric video synthesis.

Education

Present

Fudan University

Master's student, School of Computer Science.

B.Eng.

Harbin Institute of Technology

Internet of Things Engineering, Faculty of Computing.

Contact

For research discussions or collaboration, feel free to reach me by email.