Haoyi Zhu
Haoyi Zhu
Home
Featured
Publications
Experience
Gallery
Poems
Contact
Light
Dark
Automatic
Computer Vision
SPA: 3D Spatial-Awareness Enables Effective Embodied Representation
A novel representation learning framework that emphasizes the importance of 3D spatial awareness in embodied AI.
Haoyi Zhu 朱皓怡
,
Honghui Yang
,
Yating Wang
,
Jiange Yang
,
Limin Wang
,
Tong He
Cite
Project
PDF
Arxiv
Twitter
Code
HuggingFace Model
RealWorld Code
PonderV2: Pave the Way for 3D Foundation Model with A Universal Pre-training Paradigm
A general 3D pre-training approach establishing a pathway to 3D foundational models.
Haoyi Zhu 朱皓怡
,
Honghui Yang
,
Xiaoyang Wu
,
Di Huang
,
Sha Zhang
,
Xianglong He
,
Tong He
,
Hengshuang Zhao
,
Chunhua Shen
,
Yu Qiao
,
Wanli Ouyang
Cite
Github
PDF
Arxiv
UniPAD: A Universal Pre-training Paradigm for Autonomous Driving
Abstract: In the context of autonomous driving, the significance of effective feature learning is widely acknowledged. While conventional 3D self-supervised pre-training methods have shown widespread success, most methods follow the ideas originally designed for 2D images.
Honghui Yang
,
Sha Zhang
,
Di Huang
,
Xiaoyang Wu
,
Haoyi Zhu 朱皓怡
,
Tong He
,
Shixiang Tang
,
Hengshuang Zhao
,
Qibo Qiu
,
Binbin Lin
,
Xiaofei He
,
Wanli Ouyang
Cite
Github
PDF
Arxiv
AlphaTracker: a multi-animal tracking and behavioral analysis tool
Abstract: Computer vision has emerged as a powerful tool to elevate behavioral research. This protocol describes a computer vision machine learning pipeline called AlphaTracker, which has minimal hardware requirements and produces reliable tracking of multiple unmarked animals, as well as behavioral clustering.
Zexin Chen
,
Ruihan Zhang
,
Hao-Shu Fang
,
Yu E. Zhang
,
Aneesh Bal
,
Haowen Zhou
,
Rachel R. Rock
,
Nancy Padilla-Coreano
,
Laurel R. Keyes
,
Haoyi Zhu 朱皓怡
,
Yong-Lu Li
,
Takaki Komiyama
,
Kay M. Tye
,
Cewu Lu
Cite
Paper
Code
AlphaPose: Whole-Body Regional Multi-Person Pose Estimation and Tracking in Real-Time
An accurate real-time multi-person pose estimator and tracker.
Hao-Shu Fang
,
Jiefeng Li
,
Hongyang Tang
,
Chao Xu
,
Haoyi Zhu 朱皓怡
,
Yuliang Xiu
,
Yong-Lu Li
,
Cewu Lu
Cite
Project
PDF
Arxiv
Github
Model Zoo
Halpe Dataset
X-NeRF: Explicit Neural Radiance Field for Multi-Scene 360 Insufficient RGB-D Views
Abstract: Neural Radiance Fields (NeRFs), despite their outstanding performance on novel view synthesis, often need dense input views. Many papers train one model for each scene respectively and few of them explore incorporating multimodal data into this problem.
Haoyi Zhu 朱皓怡
,
Hao-Shu Fang
,
Cewu Lu
Cite
PDF
Arxiv
Code
MineDojo: Building Open-Ended Embodied Agents with Internet-Scale Knowledge
Building open-ended agents with internet-scale knowledge in Minecraft.
Linxi Fan
,
Guanzhi Wang
,
Yunfan Jiang
,
Ajay Mandlekar
,
Yuncong Yang
,
Haoyi Zhu 朱皓怡
,
Andrew Tang
,
De-An Huang
,
Yuke Zhu
,
Anima Anandkumar
Cite
Project
PDF
Arxiv
Twitter
Code
Database
Blog
Video
Unsupervised Multi-Task Learning for 3D Subtomogram Image Alignment, Clustering and Segmentation
Abstract: 3D subtomogram image alignment, clustering, and segmentation are vital to macromolecular structure recognition in cryo-electron tomography (cryo-ET). However, acquiring ground-truth labels to train a unified deep learning model that can simultaneously deal with these tasks is unaffordable.
Haoyi Zhu 朱皓怡
,
Chuting Wang
,
Yuanxin Wang
,
Zhaoxin Fan
,
Mostofa Rafid Uddin
,
Xin Gao
,
Jing Zhang
,
Xiangrui Zeng
,
Min Xu
Cite
PDF
DOI
ICIP 2022
Cite
×