Huijie Zhang

I am a Ph.D. student at the University of Michigan, Ann Arbor, supervised by Prof. Qing Qu.

Previously, I obtained my Bachelor's degree in Mechanical Engineering from Huazhong University of Science and Technology, advised by Prof. Zhigang Wu, and my Master's degree in Mechanical Engineering and Electrical and Computer Engineering from the University of Michigan, Ann Arbor, advised by Prof. Chad Jenkins.

My research interests lie in generative models, particularly diffusion models. My recent work focuses on the empirical and theoretical analysis of low-dimensional structures in diffusion models and their applications to training efficiency, controllable generation, and privacy. My previous work covers 3D vision, robotic manipulation, and reinforcement learning.

[Updated in 10/2024]

Google Scholar / Github

News

[09/2024] Our work on LOCO Edit was accepted by NeurIPS 2024!

[05/2024] Our work on Diffusion Model Reproducibility was accepted by ICML 2024!

[02/2024] Our work on Multi-stage Diffusion Model was accepted by CVPR 2024!

[10/2023] Our work on Diffusion Model Reproducibility was accepted by the NeurIPS 2023 Workshop and received the best paper award!

[08/2022] Our work on TransNet was accepted by ECCV 2022 Workshop!

[07/2022] Our work on ClearPose was accepted by ECCV 2022!

[06/2022] Our work on ProgressLabeller was accepted by IROS 2022!

Publications

	Shallow Diffuse: Robust and Invisible Watermarking through Low-Dimensional Subspaces in Diffusion Models Wenda Li, Huijie Zhang, Qing Qu In submission to ICLR, 2025 ArXiv /Code In this work, we propose Shallow Diffuse, which utilizes the low-dimensional subspace in diffusion models to disentangle the watermarking and image generation processes. Shallow Diffuse is shown, both empirically and theoretically, to be robust and consistent, outperforming previous watermarking techniques.
	Diffusion Models Learn Low-Dimensional Distributions via Subspace Clustering Peng Wang, Huijie Zhang, Zekai Zhang, Siyi Chen, Yi Ma, Qing Qu In submission to ICLR, 2025 ArXiv /Code /Website In this work, we provide theoretical insights into the connection between diffusion models and subspace clustering. This connection sheds light on the transition of diffusion models from memorization to generalization and on the mechanism by which they break the curse of dimensionality.
	Exploring Low-Dimensional Subspaces in Diffusion Models for Controllable Image Editing Siyi Chen*, Huijie Zhang, Minzhe Guo, Yifu Lu, Peng Wang, Qing Qu NeurIPS, 2024 ArXiv /Code /Website We improve the understanding of the semantic space in diffusion models and propose LOCO Edit, an editing method that achieves precise and disentangled image editing without additional training. The proposed method is supported by theoretical justification and has several useful properties: homogeneity, transferability, composability, and linearity.
	The Emergence of Reproducibility and Consistency in Diffusion Models Huijie Zhang, Jinfan Zhou, Yifu Lu, Minzhe Guo, Peng Wang, Liyue Shen, Qing Qu NeurIPS Workshop, 2023 (best paper award); ICML, 2024 ArXiv /News /Talk /Code /Website We investigate an intriguing and prevalent phenomenon of diffusion models: given the same starting noise input and a deterministic sampler, different diffusion models often yield remarkably similar outputs. We also reveal the relationship between this phenomenon and the generalizability of diffusion models.
	Improving Training Efficiency of Diffusion Models via Multi-Stage Framework and Tailored Multi-Decoder Architecture Huijie Zhang, Yifu Lu, Ismail Alkhouri, Saiprasad Ravishankar, Dogyoon Song, Qing Qu CVPR, 2024 ArXiv /Website /Github /Talk In this study, we significantly enhance the training and sampling efficiency of diffusion models through a novel multi-stage framework. The method divides the time interval into several stages and uses a specialized multi-decoder U-Net architecture that combines time-specific decoders with an encoder shared across all stages.
	TransNet: Category-Level Transparent Object Pose Estimation Huijie Zhang, Anthony Opipari, Xiaotong Chen, Jiyue Zhu, Zeren Yu, Odest Chadwicke Jenkins, ECCV Workshop, 2022 ArXiv /Website We propose TransNet, a two-stage pipeline that learns to estimate category-level transparent object pose using localized depth completion and surface normal estimation.
	ClearPose: Large-scale Transparent Object Dataset and Benchmark Xiaotong Chen, Huijie Zhang, Zeren Yu, Anthony Opipari, Odest Chadwicke Jenkins, ECCV, 2022 ArXiv /Github /Website We collected a large-scale transparent object dataset with RGB-D images and annotated poses, and benchmarked transparent object depth completion and pose estimation on this dataset.
	ProgressLabeller: Visual Data Stream Annotation for Training Object-Centric 3D Perception Xiaotong Chen, Huijie Zhang, Zeren Yu, Stanley Lewis, Odest Chadwicke Jenkins, IROS, 2022 ArXiv /Github /Website ProgressLabeller is an efficient 6D pose annotation method. It is also the first open-source tool compatible with transparent objects. It is implemented as a Blender add-on for ease of use.

Selected Project

Deep Q-learning from demonstration on Minecraft
Huijie Zhang, Phil Kangle Mu, Ying Jiang, Sihang Wei
Github

This project used Deep Q-learning from demonstration to teach an agent to cut down trees in the Minecraft environment.

This website has been inspired by Jon Barron.