|
Jiaxu Zhang | 张嘉旭
I am a research scientist at ByteDance Seed, focusing on AIGC and MLLM research.
I previously interned at Tencent from 2022 to 2024 and at StepFun from
2024 to 2025, where I worked closely with Dr. Gang Yu. I
received my Ph.D. degree jointly from Wuhan University and Nanyang Technological
University, under the supervision of Prof. Zhen Dong and Prof. Guosheng Lin.
My research interests cover computer vision and computer graphics, with a particular focus on
multimodal AIGC, 2D/3D character animation, video/motion generation, retargeting, and
recognition.
Feel free to reach out via WeChat if you are interested in research collaboration.
Email / CV / Google Scholar / GitHub / WeChat
|
|
News
|
[2026/05] 🎉 UniMAGE was accepted to ICML 2026.
[2025/06] 🎉 MikuDance was accepted to ICCV 2025 (Oral).
[2025/01] 🕹️ MikuDance has launched on Lipu,
an AI creation community designed for animation enthusiasts. Feel free to give it a try!
[2024/07] 🎉 One paper was accepted to ACM MM 2024.
[2024/04] 🕹️ I've released a repository, Freehand-Genshin-Diffusion,
that transforms Genshin PVs into a freehand style using
a diffusion model. Feel free to give it a try!
[2024/04] 🎉 One paper, an extension of our CVPR 2023 paper,
was accepted to IEEE T-PAMI.
[2024/01] 🎉 One paper was accepted to ICLR 2024.
[2023/06] 📌 I gave an oral presentation on Virtual
Animation Technology at VALSE 2023.
[2023/02] 🎉 One paper was accepted to CVPR 2023.
|
Research
My research interests are broadly in 3D/2D Computer
Vision and Generative AI. My overarching research objective is to advance
AI-driven methods that augment and amplify human creativity.
|
|
FlowAct-R1: Towards Interactive Humanoid Video Generation
FlowAct Team, ByteDance Intelligent Creation
Tech Report, 2026
project
page / arxiv
We present FlowAct-R1, a novel framework that enables lifelike, responsive, and high-fidelity
humanoid video generation for seamless real-time interaction.
|
|
|
Bridging Your Imagination with Audio-Video Generation via a Unified
Director
Jiaxu Zhang, Tianshu Hu, Yuan Zhang, Zenan Li, Linjie Luo, Guosheng Lin*, Xin Chen*
Forty-Third International Conference on Machine Learning (ICML), 2026
project
page / arxiv
UniMAGE unifies script drafting, extension, continuation, and keyframe image generation, thereby
enabling coherent long-form storytelling with consistent characters and cinematic visual
compositions. The generated scripts and keyframes can further serve as structured, high-level
guidance for existing audio-video joint generation models.
|
|
DreamDance: Animating Character Art via Inpainting Stable Gaussian
Worlds
Jiaxu Zhang, Xianfang Zeng, Xin Chen,
Wei Zuo, Gang Yu*, Guosheng Lin, Zhigang Tu*
ArXiv, 2025
project
page / code / arxiv
We propose DreamDance, a novel paradigm that reformulates the character art animation task into two
inpainting-based steps: Camera-aware Scene Inpainting for stable scene reconstruction and Pose-aware
Video Inpainting for dynamic character animation.
|
|
MikuDance: Animating
Character Art with Mixed Motion Dynamics
Jiaxu Zhang, Xianfang Zeng, Xin Chen,
Wei Zuo, Gang Yu*, Zhigang Tu*
Proceedings of the International Conference on Computer Vision (ICCV, Oral),
2025
project
page / code / arxiv
We propose MikuDance, a diffusion-based pipeline
incorporating mixed motion dynamics to animate
stylized character art.
|
|
Freehand-Genshin-Diffusion
A project for transforming Genshin PVs into a
freehand style using a diffusion model.
I've been exploring 2D image animation recently. This
project is purely for fun. Feel free to reach out and
discuss this with me.
|
|
A Modular Neural Motion
Retargeting System Decoupling Skeleton and Shape
Perception
Jiaxu Zhang, Zhigang Tu*, Junwu Weng,
Junsong Yuan, Bo Du
IEEE Transactions on Pattern Analysis and Machine
Intelligence (T-PAMI), 2024
code / arxiv
M-R2ET is a modular neural motion retargeting system
designed to transfer motion between characters whose
skeletons differ in structure but correspond to homeomorphic
graphs, while preserving motion semantics and
perceiving shape geometry.
|
|
Generative Motion Stylization
of Cross-structure Characters within Canonical
Motion Space
Jiaxu Zhang, Xin Chen, Gang Yu, Zhigang
Tu*
Proceedings of the 32nd ACM International Conference
on Multimedia (ACM MM), 2024
arxiv
We present MotionS, a generative motion stylization
pipeline for synthesizing diverse, stylized motion
for cross-structure characters using cross-modality style
prompts.
|
|
TapMo: Shape-aware Motion
Generation of Skeleton-free Characters
Jiaxu Zhang#, Shaoli Huang#, Zhigang
Tu*, Xin Chen, Xiaohang Zhan, Gang Yu, Ying Shan
The Twelfth International Conference on Learning
Representations (ICLR), 2024
project
page / code / arxiv
TapMo is a text-based animation pipeline for
generating motion in a wide variety of skeleton-free
characters.
|
|
Skinned Motion Retargeting
with Residual Perception of Motion Semantics &
Geometry
Jiaxu Zhang, Junwu Weng, Di Kang, Fang
Zhao, Shaoli Huang, Xuefei Zhe, Linchao Bao, Ying Shan,
Jue Wang, Zhigang Tu*
Proceedings of the IEEE/CVF Conference on Computer
Vision and Pattern Recognition
(CVPR), 2023
project
page / code / arxiv
R2ET is a neural motion retargeting model that can
preserve the source motion semantics and avoid
interpenetration in the target motion.
|
|
Zoom Transformer for
Skeleton-based Group Activity Recognition
Jiaxu Zhang, Yifan Jia, Wei Xie, and
Zhigang Tu*
IEEE Transactions on Circuits and Systems for Video
Technology (T-CSVT), 2022
code
/ arxiv
We propose a novel Zoom Transformer to exploit both
the low-level single-person motion information and the
high-level multi-person interaction information in a
uniform attention structure.
|
|
Joint-bone Fusion Graph
Convolutional Network for Semi-supervised Skeleton
Action Recognition
Zhigang Tu#, Jiaxu Zhang#*, Hongyan Li,
Yujin Chen, and Junsong Yuan
IEEE Transactions on Multimedia
(T-MM), 2022
code / arxiv
We propose a semi-supervised skeleton-based action
recognition method.
|
Experience
 |
ByteDance Seed
2026.01 - Present, Hangzhou
Research Scientist for AIGC and MLLM.
Advisors: Dr. Tianshu Hu and Dr. Mingyuan Gao
|
 |
ByteDance
2025.06 - 2026.01, Shenzhen
Research Intern for AIGC and MLLM.
Advisors: Dr. Xin Chen and Dr. Tianshu Hu
|
 |
StepFun
2024.05 - 2025.06, Shanghai
Research Intern for AIGC.
Advisors: Dr. Gang Yu and Dr. Xianfang Zeng
|
 |
Tencent
2023.06 - 2024.04, Shanghai
Research Intern at Tencent PCG.
Advisors: Dr. Gang Yu and Dr. Xin Chen
2022.07 - 2023.06, Shenzhen
Research Intern at Tencent AI Lab.
Advisors: Dr. Junwu Weng and Dr. Shaoli Huang
|
 |
Nanyang Technological
University (NTU)
2025.02 - 2026.02, Singapore
Joint-PhD Student
Research Advisor: Prof. Guosheng Lin
|
 |
Wuhan
University
2020.09 - 2026.06, Wuhan
Ph.D. Student in LIESMARS.
I received my Master's degree in
Computer Technology in 2023.
Research Advisor: Prof. Zhigang Tu
|
 |
Southeast
University
2016.09 - 2020.06, Nanjing
I received my B.S. degree in Geographic Information
Science in 2020. GPA: 3.88/4.0, Rank: 1/26.
2018.11 - 2020.06, Nanjing
Research Assistant at the Research Center
of Complex Transportation Network (TLab).
|
Awards and Honors
|
2025: Academic Innovation Award of Wuhan University (15,000RMB¥,
Top 1%)
2024: NSFC Basic Research Project for Youth Scholars
(300,000RMB¥)
2023: Lei
Jun Excellence Scholarship (100,000RMB¥,
Top 0.1‰)
2023: Wang Zhizhuo Innovative Talent Award
(8,000RMB¥, Top 1%)
2022: National Scholarship (Highest
Honor for Master students in China,
10,000RMB¥, Top 3%)
2022: First-class Scholarship of Wuhan University
(5,000RMB¥, Top 10%)
2021: First-class Scholarship of Wuhan University
(5,000RMB¥, Top 10%)
2021: 1st Runner-up of ICCV
2021 MMVRAC Challenge (Track 2 and Track 3)
2020: Outstanding Graduate of Southeast
University (Top 3%)
2019: Meritorious Winner - Mathematical Contest In
Modeling & Interdisciplinary Contest In Modeling,
2019
2018: National Scholarship (Highest
Honor for undergraduates in China, 8,000RMB¥,
Top 3%)
|
This homepage is designed based on Jon Barron's website and
deployed on GitHub Pages. Last updated: Jun. 2026
© 2026 Jiaxu Zhang
|