Huijie Zhang
I am a Ph.D. student at the University of Michigan, Ann Arbor, supervised by Prof. Qing Qu.
Previously, I obtained my Bachelor's degree in Mechanical Engineering
from Huazhong University of Science and Technology,
advised by Prof. Zhigang Wu, and my Master's degree in Mechanical Engineering and Electrical and Computer Engineering from the University of Michigan, Ann Arbor, advised by Prof. Chad Jenkins.
My research interests lie in generative models and diffusion models. My recent projects focus on the empirical and theoretical analysis of low-dimensional structures in diffusion models, and their applications to training efficiency, controllable generation, and privacy. My previous work covered 3D vision, robotic manipulation, and reinforcement learning.
[Updated in 10/2024]
Google Scholar  / 
GitHub
Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models
Wenda Li*,
Huijie Zhang*,
Qing Qu
In submission to ICLR, 2025
ArXiv
/Code
In this work, we propose Shallow Diffuse, which utilizes the low-dimensional subspace in diffusion models to disentangle the watermarking and image generation processes. We demonstrate both empirically and theoretically that Shallow Diffuse is robust and consistent, outperforming previous watermarking techniques.
Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering
Peng Wang*,
Huijie Zhang*,
Zekai Zhang,
Siyi Chen,
Yi Ma,
Qing Qu
In submission to ICLR, 2025
ArXiv
/Code
/Website
In this work, we provide theoretical insights into the connection between diffusion models and subspace clustering. This connection sheds light on the transition of diffusion models from memorization to generalization and the mechanism by which they break the curse of dimensionality.
Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing
Siyi Chen*,
Huijie Zhang*,
Minzhe Guo,
Yifu Lu,
Peng Wang,
Qing Qu
NeurIPS, 2024
ArXiv
/Code
/Website
We improve the understanding of the semantic space in diffusion models and propose LOCO Edit, an editing method that achieves precise and disentangled image editing without additional training. The proposed method is supported by theoretical justification and enjoys desirable properties: homogeneity, transferability, composability, and linearity.
The Emergence of Reproducibility and Consistency in Diffusion Models
Huijie Zhang*,
Jinfan Zhou*,
Yifu Lu,
Minzhe Guo,
Peng Wang,
Liyue Shen,
Qing Qu
NeurIPS Workshop, 2023 (Best Paper Award); ICML, 2024
ArXiv
/News
/Talk
/Code
/Website
We investigate an intriguing and prevalent phenomenon of diffusion models: given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs. We also reveal its relationship with the generalizability of diffusion models.
Improving Training Efficiency of Diffusion Models via Multi-Stage Framework
and Tailored Multi-Decoder Architecture
Huijie Zhang*,
Yifu Lu*,
Ismail Alkhouri,
Saiprasad Ravishankar,
Dogyoon Song,
Qing Qu
CVPR, 2024
ArXiv
/Website
/Github
/Talk
In this study, we significantly enhance the training and sampling efficiency of diffusion models
through a novel multi-stage framework. This method divides the time interval into several stages,
using a specialized multi-decoder U-Net architecture that combines time-specific decoders with an
encoder shared across all stages.
TransNet: Category-Level Transparent Object Pose Estimation
Huijie Zhang,
Anthony Opipari,
Xiaotong Chen,
Jiyue Zhu,
Zeren Yu,
Odest Chadwicke Jenkins
ECCV Workshop, 2022
ArXiv
/Website
We propose TransNet, a two-stage pipeline that learns to estimate category-level transparent object pose using localized depth completion and surface normal estimation.
ClearPose: Large-scale Transparent Object Dataset and Benchmark
Xiaotong Chen,
Huijie Zhang,
Zeren Yu,
Anthony Opipari,
Odest Chadwicke Jenkins
ECCV, 2022
ArXiv
/Github
/Website
We collected ClearPose, a large-scale transparent object dataset with RGB-D images and annotated poses,
and benchmarked transparent object depth completion and pose estimation on this dataset.
ProgressLabeller: Visual Data Stream Annotation for Training Object-Centric 3D Perception
Xiaotong Chen,
Huijie Zhang,
Zeren Yu,
Stanley Lewis,
Odest Chadwicke Jenkins
IROS, 2022
ArXiv
/Github
/Website
ProgressLabeller is an efficient 6D pose annotation method and the first open-source tool compatible with transparent objects.
It is implemented as a Blender add-on for a more user-friendly annotation workflow.