Yuanlu Xu | AI Research Scientist

2025

MHR: Momentum Human Rig

A Ferguson, A Osman, B Bescos, C Stoll, C Twigg, C Lassner, ..., Y Xu, Y Ye, Z Jiang

arXiv 2025

arXiv Code

2025

Generating High-Fidelity Clothed Human Dynamics with Temporal Diffusion

S Zou, Y Xu, N Sarafianos, F Bogo, T Tung, W Si, L Cheng

ACM TOMM 2025

Paper

2025

DGH: Dynamic Gaussian Hair

J Wang, Y Xu, E Tretschk, Z Wang, A Ianina, A Bozic, U Neumann, T Tung

NeurIPS 2025

Project arXiv Paper

2024

RoHM: Robust Human Motion Reconstruction via Diffusion

S Zhang, BL Bhatnagar, Y Xu, A Winkler, P Kadlecek, S Tang, F Bogo

CVPR 2024

Project arXiv Paper Code

2024

ANIM: Accurate Neural Implicit Model for Human Reconstruction from a Single RGB-D Image

M Pesavento, Y Xu, N Sarafianos, R Maier, Z Wang, C Yao, M Volino, E Boyer, A Hilton, T Tung

CVPR 2024

Project arXiv Paper

2024

HISR: Hybrid Implicit Surface Representation for Photorealistic 3D Human Reconstruction

A Wang, Y Xu, N Sarafianos, R Maier, E Boyer, A Yuille, T Tung

AAAI 2024

arXiv Paper

2023

VIVE3D: Viewpoint-Independent Video Editing using 3D-Aware GANs

A Frühstück, N Sarafianos, Y Xu, P Wonka, T Tung

CVPR 2023

Project arXiv Paper Code

2023

Snipper: A Spatiotemporal Transformer for Simultaneous Multi-Person 3D Pose Estimation, Tracking and Forecasting

S Zou, Y Xu, C Li, L Ma, L Cheng, M Vo

IEEE TCSVT 2023

arXiv Paper Code

2023

NSF: Neural Surface Fields for Human Modeling from Monocular Depth

Y Xue, BL Bhatnagar, R Marin, N Sarafianos, Y Xu, G Pons-Moll, T Tung

ICCV 2023

Project arXiv Paper Code

2023

Multi-View Reconstruction using Signed Ray Distance Functions (SRDF)

P Zins, Y Xu, E Boyer, S Wuhrer, T Tung

CVPR 2023

arXiv Paper

2022

Multiview Human Body Reconstruction from Uncalibrated Cameras

Z Yu, L Zhang, Y Xu, C Tang, L Tran, C Keskin, HS Park

NeurIPS 2022

Paper

2022

BodyMap: Learning Full-Body Dense Correspondence Map

A Ianina, N Sarafianos, Y Xu, I Rocco, T Tung

CVPR 2022

Project arXiv Paper

2021

ARCH++: Animation-Ready Clothed Human Reconstruction Revisited

T He, Y Xu, S Saito, S Soatto, T Tung

ICCV 2021

arXiv Paper

2021

Monocular 3D Pose Estimation via Pose Grammar and Data Augmentation

Y Xu, W Wang, T Liu, X Liu, J Xie, SC Zhu

IEEE TPAMI

Paper

2021

Data-Driven 3D Reconstruction of Dressed Humans from Sparse Views

P Zins, Y Xu, E Boyer, S Wuhrer, T Tung

3DV 2021

Project arXiv Paper Code

2021

UNOC: Understanding Occlusion for Embodied Presence in Virtual Reality

M Parger, C Tang, Y Xu, C Twigg, L Tao, Y Li, R Wang, M Steinberger

IEEE TVCG

arXiv Paper Code

2020

ARCH: Animatable Reconstruction of Clothed Humans

Z Huang, Y Xu, C Lassner, H Li, T Tung

CVPR 2020

Project arXiv Paper Code

2019

DenseRaC: Joint 3D Pose and Shape Estimation by Dense Render-and-Compare

Y Xu, SC Zhu, T Tung

ICCV 2019

arXiv Paper

2018

Attentive Fashion Grammar Network for Fashion Landmark Detection and Clothing Category Classification

W Wang*, Y Xu*, J Shen, SC Zhu

CVPR 2018

Paper Code

2018

Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image

S Huang, S Qi, Y Zhu, Y Xiao, Y Xu, SC Zhu

ECCV 2018

Project arXiv Paper Code

2018

A Causal And-Or Graph Model for Visibility Fluent Reasoning in Tracking Interacting Objects

Y Xu, L Qin, X Liu, J Xie, SC Zhu

CVPR 2018

Paper

2018

Learning Pose Grammar to Encode Human Body Configuration for 3D Pose Estimation

H Fang*, Y Xu*, W Wang, X Liu, SC Zhu

AAAI 2018

arXiv Paper

2018

Scene-Centric Joint Parsing of Cross-View Videos

H Qi*, Y Xu*, T Yuan*, T Wu, SC Zhu

AAAI 2018

arXiv Paper

2017

Cross-View People Tracking by Scene-Centered Spatio-Temporal Parsing

Y Xu, X Liu, L Qin, SC Zhu

AAAI 2017

arXiv Paper

2017

Place-Centric Visual Urban Perception with Deep Multi-Instance Regression

X Liu, Q Chen, L Zhu, Y Xu, L Lin

ACM MM 2017

Paper

2017

A Stochastic Attribute Grammar for Robust Cross-View Human Tracking

X Liu, Y Xu, L Zhu, Y Mu

IEEE TCSVT

Paper

2016

Multi-View People Tracking via Hierarchical Trajectory Composition

Y Xu, X Liu, Y Liu, SC Zhu

CVPR 2016

Paper

2014

Person Search in a Scene by Jointly Modeling People Commonness and Person Uniqueness

Y Xu, B Ma, R Huang, L Lin

ACM MM 2014

Paper

2014

Complex Background Subtraction by Pursuing Dynamic Spatio-Temporal Models

L Lin, Y Xu, X Liang, JH Lai

IEEE TIP

arXiv Paper

2013

Human Re-Identification by Matching Compositional Template with Cluster Sampling

Y Xu, L Lin, WS Zheng, X Liu

ICCV 2013

arXiv Paper

2012

Realtime Object-of-Interest Tracking by Learning Composite Patch-Based Templates

Y Xu, H Zhou, Q Wang, L Lin

ICIP 2012

Paper

Research Interests

3D Vision & Graphics

Digital Avatars

CV / ML

Generative Models

VLM / MLLM

Experience

AI Research Scientist

Research Scientist

Ph.D. in Computer Science
