Issue: Interpreting the attributes of 3D Bounding box #20

Open
Suraj520 opened this issue Oct 15, 2020 · 1 comment

Comments


Suraj520 commented Oct 15, 2020

Hi @yinyunie, thanks for your amazing research and contribution to the spatial understanding domain. I am stuck on how to interpret the output of the source code in relation to the output described in the paper. I have summarized my question below; it would be great if you could respond.

As far as I can see, the output of the 3D object detection network is saved as bdb_3d.mat, which looks like a dictionary with the following keys for each object instance detected by the 2D object detection network:
1. 'basis'
2. 'coeffs'
3. 'centroid'
4. 'classid'

The basis seems to be the rotation matrix of the bounding box (R, 3×3), from which we can get the Euler angles in the closed interval [-π, π]. What do coeffs and centroid in the .mat file signify?
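For reference, recovering the angle from such a basis might look like the sketch below. It assumes the basis is a pure rotation about the vertical (y) axis with the layout [[cos θ, 0, -sin θ], [0, 1, 0], [sin θ, 0, cos θ]]; that layout is an assumption, not something confirmed in this thread, and the sign of the result flips if the repository uses the transposed convention. The function names are hypothetical.

```python
import math


def basis_from_yaw(theta):
    """Build a 3x3 basis for a rotation by theta about the y-axis.

    The matrix layout is an assumed convention, chosen for illustration.
    """
    c, s = math.cos(theta), math.sin(theta)
    return [[c, 0.0, -s],
            [0.0, 1.0, 0.0],
            [s, 0.0, c]]


def yaw_from_basis(basis):
    """Recover the rotation angle in [-pi, pi] from a basis of the form above."""
    # atan2 takes the sine term first, then the cosine term.
    return math.atan2(basis[2][0], basis[0][0])


theta = yaw_from_basis(basis_from_yaw(0.5))
```

Indexing as `basis[row][col]` works for nested lists and for a NumPy array loaded from the .mat file alike.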

Please refer to the cropped Section 3.1 of the paper attached below, which says that any 3D bounding box in the world coordinate system is defined by C, s and θ.

Which of the aforementioned keys in bdb_3d.mat correspond to C (the 3D center) and s (the spatial size)?

Thanks, looking forward to a response.

3DObjectDetection

@Suraj520 Suraj520 changed the title Issue: Interpretating the attributes of 3D Bounding box Issue: Interpreting the attributes of 3D Bounding box Oct 15, 2020

alando46 commented Mar 9, 2021

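coeffs = distance from the 3D center (centroid) to the faces of the box along each of the three basis axes, i.e. one value per axis rather than one per vertex. Each detection, and the layout box for the entire scene, has one coefficient vector in R^3. The coefficients represent the spatial size s of the bounding box.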

centroid = predicted <i, j, k> center of each detection. Thus centroid corresponds to C, the 3D center.

As you mentioned, θ represents the rotation angle (of the world coordinate system) used to create the basis.
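To make the roles of the three keys concrete, here is a minimal sketch of how the eight corners of a box could be rebuilt from them. It assumes that each row of basis is one box axis and that coeffs holds the half-extent along that axis; both are assumptions about the file layout and should be checked against the repository's own visualization code. `box_corners` is a hypothetical helper, not a function from the project.

```python
from itertools import product


def box_corners(basis, coeffs, centroid):
    """Return the 8 corners of an oriented 3D box as (x, y, z) tuples.

    Assumed conventions: basis[i] is the i-th box axis (a unit vector),
    coeffs[i] is the half-extent along that axis, centroid is the center.
    """
    corners = []
    # Each corner is the center plus or minus the half-extent along every axis.
    for signs in product((-1.0, 1.0), repeat=3):
        corner = list(centroid)
        for axis in range(3):
            for dim in range(3):
                corner[dim] += signs[axis] * coeffs[axis] * basis[axis][dim]
        corners.append(tuple(corner))
    return corners


# Axis-aligned example: a box centered at (10, 0, 0) with half-extents 1, 2, 3.
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
corners = box_corners(identity, [1.0, 2.0, 3.0], [10.0, 0.0, 0.0])
```

Under these assumptions the full side lengths of the box are twice the values in coeffs, and the corners average back to centroid.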
