Huijie Zhang

I am a Ph.D. student at the University of Michigan, Ann Arbor, advised by Prof. Qing Qu. I am currently a research intern at Snap Inc., supervised by Ivan Skorokhodov, Aliaksandr Siarohin, and Sergey Tulyakov.

Previously, I obtained my Bachelor's degree in Mechanical Engineering from Huazhong University of Science and Technology, advised by Prof. Zhigang Wu, and my Master's degree in Mechanical Engineering and Electrical and Computer Engineering from the University of Michigan, Ann Arbor, advised by Prof. Chad Jenkins.

My research centers on generative models, with a particular focus on developing a rigorous theoretical understanding of diffusion models and leveraging these insights to drive practical advancements. I study their generalization behavior, interpretability, and underlying low-dimensional structures, and use these theoretical insights to enhance the efficiency, controllability, and safety of diffusion-based generation.

[Updated in 7/2025]

Google Scholar  /  Github

News

[06/2025] I joined Snap Inc. as a research intern in the Creative Vision Team!

Publication
Understanding Generalization in Diffusion Models via Probability Flow Distance
Huijie Zhang, Zijian Huang, Siyi Chen, Jinfan Zhou, Zekai Zhang, Peng Wang, Qing Qu
ArXiv /Website

In this work, we propose the probability flow distance (PFD), a theoretically grounded and computationally efficient metric for measuring distributional generalization. Using PFD under a teacher-student evaluation protocol, we empirically uncover several novel generalization behaviors in diffusion models.

Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models
Wenda Li*, Huijie Zhang*, Qing Qu
ArXiv /Code /Website

In this work, we propose Shallow Diffuse, which exploits the low-dimensional subspace in diffusion models to disentangle watermarking from the image generation process. We demonstrate, both empirically and theoretically, that Shallow Diffuse is robust and consistent, outperforming previous watermarking techniques.

Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering
Peng Wang*, Huijie Zhang*, Zekai Zhang, Siyi Chen, Yi Ma, Qing Qu
ArXiv /Code /Website

In this work, we provide theoretical insights into the connection between diffusion models and subspace clustering. This connection sheds light on the transition of diffusion models from memorization to generalization and the mechanism by which they break the curse of dimensionality.

Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing
Siyi Chen*, Huijie Zhang*, Minzhe Guo, Yifu Lu, Peng Wang, Qing Qu
NeurIPS, 2024
ArXiv /Code /Website

We improve the understanding of the semantic space in diffusion models and propose LOCO Edit, an editing method that achieves precise and disentangled image editing without additional training. The proposed method is supported by theoretical justification and exhibits desirable properties: homogeneity, transferability, composability, and linearity.

The Emergence of Reproducibility and Consistency in Diffusion Models
Huijie Zhang*, Jinfan Zhou*, Yifu Lu, Minzhe Guo,
Peng Wang, Liyue Shen, Qing Qu
NeurIPS Workshop, 2023 (best paper award); ICML, 2024
ArXiv /News /Talk /Code /Website

We investigate an intriguing and prevalent phenomenon of diffusion models: given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs. We also reveal its relationship with the generalizability of diffusion models.

Improving Training Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architecture
Huijie Zhang*, Yifu Lu*, Ismail Alkhouri, Saiprasad Ravishankar,
Dogyoon Song, Qing Qu
CVPR, 2024
ArXiv /Website /Github /Talk

In this study, we significantly enhance the training and sampling efficiency of diffusion models through a novel multi-stage framework. The method divides the time interval into several stages and uses a specialized multi-decoder U-Net architecture that combines time-specific decoders with an encoder shared across all stages.

TransNet: Category-Level Transparent Object Pose Estimation
Huijie Zhang, Anthony Opipari, Xiaotong Chen,
Jiyue Zhu, Zeren Yu, Odest Chadwicke Jenkins
ECCV Workshop, 2022
ArXiv /Website

We propose TransNet, a two-stage pipeline that learns to estimate category-level transparent object pose using localized depth completion and surface normal estimation.

ClearPose: Large-scale Transparent Object Dataset and Benchmark
Xiaotong Chen, Huijie Zhang, Zeren Yu,
Anthony Opipari, Odest Chadwicke Jenkins
ECCV, 2022
ArXiv /Github /Website

We collect a large-scale transparent object dataset with RGB-D images and annotated poses, and benchmark transparent object depth completion and pose estimation on this dataset.

ProgressLabeller: Visual Data Stream Annotation for Training Object-Centric 3D Perception
Xiaotong Chen, Huijie Zhang, Zeren Yu,
Stanley Lewis, Odest Chadwicke Jenkins
IROS, 2022
ArXiv /Github /Website

ProgressLabeller is an efficient 6D pose annotation method. It is also the first open-source tool compatible with transparent objects, and it is implemented as a Blender add-on for ease of use.



This website has been inspired by Jon Barron.