Hi @yinyunie, thanks for your amazing research and contribution to the spatial understanding domain. I am stuck on a question about how the output of the source code relates to the output discussed in the paper. I have summarized it below; it would be great if you could respond.
As far as I can see, the output of the 3D object detection network is saved as bdb_3d.mat, which appears to be a dictionary with the following keys for each object instance detected by the 2D object detection network:
1.'basis'
2.'coeffs'
3.'centroid'
4.'classid'
The basis seems to be the rotation matrix of the bounding box (R, 3×3), from which we can get the Euler angles in the closed interval [-pi, pi]. What do coeffs and centroid in the mat file signify?
Please refer to the cropped Section 3.1 of the paper attached below, which says that any 3D bounding box in the world coordinate system is defined by C, s and theta.
Which of the aforementioned keys in bdb_3d.mat correspond to C (the 3D center) and s (the spatial size)?
Thanks, anticipating a response.
Suraj520 changed the title from "Issue: Interpretating the attributes of 3D Bounding box" to "Issue: Interpreting the attributes of 3D Bounding box" on Oct 15, 2020.
coeffs = the distances from the 3D center (centroid) out to the box boundary along each basis axis, which together place the vertices. Each detected object, and the layout as a whole, has one such vector in R^3. coeffs therefore represents the spatial size s of the bounding box.
centroid = the predicted <i,j,k> center of each detection. Thus centroid corresponds to C, the 3D center.
As you mentioned, θ is the rotation angle (in the world coordinate system) used to create the basis.
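To make the mapping concrete, here is a minimal sketch of how the eight box corners could be rebuilt from the three keys. It is not taken from the repository: the function name `box_corners` and the sample values are hypothetical, and it assumes that the rows of basis are the box's unit axes and that coeffs holds the half-extent along each axis (so the full size s would be 2 * coeffs). Please check these assumptions against the repository's own utilities before relying on them.

```python
import numpy as np


def box_corners(basis, coeffs, centroid):
    """Return the 8 corners of an oriented 3D box as an (8, 3) array.

    Assumptions (not confirmed by the thread): rows of `basis` are the
    box's unit axes, and `coeffs` holds the half-extent along each axis.
    """
    basis = np.asarray(basis, dtype=float).reshape(3, 3)
    coeffs = np.abs(np.asarray(coeffs, dtype=float).reshape(3))
    centroid = np.asarray(centroid, dtype=float).reshape(3)

    # Every +/-1 sign combination over the three axes: shape (8, 3).
    signs = np.array(
        [[sx, sy, sz] for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)],
        dtype=float,
    )

    # corner = centroid + sum_i sign_i * coeffs_i * basis[i]
    return centroid + (signs * coeffs) @ basis


# Hypothetical values for illustration only (axis-aligned box).
basis = np.eye(3)
coeffs = np.array([0.5, 1.0, 0.25])
centroid = np.array([1.0, 2.0, 3.0])

corners = box_corners(basis, coeffs, centroid)
size = 2.0 * coeffs  # full spatial size s, under the half-extent assumption
```

With an identity basis the corners are simply centroid ± coeffs on each axis, which is an easy way to sanity-check the convention against a box from an actual bdb_3d.mat file.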