We ablate our method with and without explicitly considering hand pose. The comparison suggests that hand information provides a cue complementary to the visual input and helps alleviate depth ambiguity.
[Figure: qualitative comparison. Columns: Input, Ground Truth, W/O Articulation, Our Method.]
BibTeX
@inproceedings{ye2022hand,
author = {Ye, Yufei
and Gupta, Abhinav
and Tulsiani, Shubham},
title = {What's in your hands? 3D Reconstruction of Generic Objects in Hands},
booktitle = {CVPR},
year = {2022}
}