Body pose looks misaligned in EgoExo4D #331
The 2D keypoints are in undistorted image space, relative to the camera intrinsics released with EgoPose. cc @suyogduttjain
That's helpful. I've tried using OpenCV's 4-parameter fisheye camera model to interpret the distortion coefficients, but the results don't seem aligned. What camera model should we use?

Edit: based on https://github.com/facebookresearch/Ego4d/blob/6056f8deac0cea8d8d2caad2f55995506941156c/ego4d/internal/human_pose/undistort_to_halo.py it does look like OpenCV fisheye. Might it be that the released intrinsic matrix is the "new_K" after undistortion? That may explain why projecting the 3D annotations according to the camera parameters (including distortion coeffs) gives wrong alignment. More broadly, it would be helpful to have an example of visualizing the pose.

Edit 2: For the basketball sequence, I get correct alignment after dividing the focal length by 0.7. However, the factor seems to be different for each video.
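For reference, this is roughly the projection being attempted; a minimal sketch, assuming hypothetical arrays `points3d_world` (N, 3), world-to-camera extrinsics `R`, `t`, the released intrinsic matrix `K`, and the 4 fisheye coefficients `D`:

```python
import cv2
import numpy as np

# Hypothetical inputs: points3d_world (N, 3), R (3, 3), t (3,), K (3, 3), D (4,)
rvec, _ = cv2.Rodrigues(R)  # OpenCV wants a rotation vector, not a matrix
points2d, _ = cv2.fisheye.projectPoints(
    points3d_world.reshape(-1, 1, 3).astype(np.float64), rvec,
    t.astype(np.float64), K, D)
points2d = points2d.reshape(-1, 2)  # pixel coordinates in the distorted image
```

If the released K is in fact the post-undistortion new_K, this projection lands in the wrong place, which would explain the misalignment described above.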
Hi, this is our challenge/baseline repository: https://github.com/EGO4D/ego-exo4d-egopose. It contains a lot of useful information on data loading, preparation, and how to work with this data.
That is unfortunately only about the Aria data, not the Exo data. I still suspect the released intrinsics are not correct for Exo. It seems that the released intrinsic matrix is already the undistorted one: I get exact alignment between the released 2D coords and the projection of the released 3D coords with the released intrinsics and extrinsics while ignoring the distortion coeffs. Therefore the released intrinsics are probably the new_K, as output by cv2.fisheye.estimateNewCameraMatrixForUndistortRectify.
There's no simple formula to invert estimateNewCameraMatrixForUndistortRectify, but the original focal length can be recovered numerically:

```python
import cv2
import numpy as np
import scipy.optimize

def get_orig_intrinsic_matrix(released_intrinsic_matrix, distortion_coeffs):
    size = (int(released_intrinsic_matrix[0, 2] * 2), int(released_intrinsic_matrix[1, 2] * 2))
    orig_intr = released_intrinsic_matrix.copy()

    def objective(focal):
        orig_intr[0, 0] = focal
        orig_intr[1, 1] = focal
        new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(
            orig_intr, distortion_coeffs, size, np.eye(3), balance=0.8)
        return (new_K[0, 0] - released_intrinsic_matrix[0, 0]) ** 2

    optimal_focal = scipy.optimize.minimize_scalar(
        objective, bounds=(100, 5000), method='bounded', options=dict(xatol=1e-4)).x
    orig_intr[0, 0] = optimal_focal
    orig_intr[1, 1] = optimal_focal
    return orig_intr
```

Projecting with the resulting parameters gives the correct alignment.

In summary, the original intrinsic matrices for the exo cameras would be useful to have as well.
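For what it's worth, a hypothetical usage of the recovered matrix might then look like this (assuming `released_K`, `D`, and `points3d_cam`, 3D joints already transformed into camera coordinates, are available):

```python
import cv2
import numpy as np

orig_K = get_orig_intrinsic_matrix(released_K, D)
rvec = tvec = np.zeros(3)  # points are assumed to already be in camera coordinates
points2d, _ = cv2.fisheye.projectPoints(
    points3d_cam.reshape(-1, 1, 3).astype(np.float64), rvec, tvec, orig_K, D)
points2d = points2d.reshape(-1, 2)  # should align in the original (distorted) image
```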
Exo intrinsics have been released in the gopro_calibs.csv file under each capture's trajectory directory:

```python
capture_traj_dir = os.path.join(RELEASE_DIR, take["capture"]["root_dir"], "trajectory")
assert os.path.exists(capture_traj_dir)
gopro_calibs_df = pd.read_csv(os.path.join(capture_traj_dir, "gopro_calibs.csv"))
calib_df = gopro_calibs_df[gopro_calibs_df.cam_uid == cam_id]
D, I = get_distortion_and_intrinsics(calib_df.iloc[0].to_dict())
```

The intrinsics did change over time due to updates to the MPS algorithm during development, so the intrinsics used for the annotations may be different. I did a basic check, and the GoPro intrinsics appear to be the same. I will let @suyogduttjain comment on this, as he led our annotation effort for body/hand pose and pushed these benchmarks to the finish line for the release.

For the above-referenced function:

```python
def undistort_exocam(image, intrinsics, distortion_coeffs, dimension=(3840, 2160)):
    DIM = dimension
    dim2 = None
    dim3 = None
    balance = 0.8
    dim1 = image.shape[:2][::-1]  # dim1 is the dimension of the input image to undistort
    # Change the calibration dim dynamically (bouldering cam01 and cam04 are vertical, for example)
    if DIM[0] != dim1[0]:
        DIM = (DIM[1], DIM[0])
    assert dim1[0] / dim1[1] == DIM[0] / DIM[1], \
        "Image to undistort needs to have the same aspect ratio as the ones used in calibration"
    if not dim2:
        dim2 = dim1
    if not dim3:
        dim3 = dim1
    scaled_K = intrinsics * dim1[0] / DIM[0]  # The values of K scale with the image dimension,
    scaled_K[2][2] = 1.0                      # except that K[2][2] is always 1.0
    # This is how scaled_K, dim2 and balance are used to determine the final K used
    # to undistort the image. The OpenCV documentation does not make this clear!
    new_K = cv2.fisheye.estimateNewCameraMatrixForUndistortRectify(
        scaled_K, distortion_coeffs, dim2, np.eye(3), balance=balance)
    map1, map2 = cv2.fisheye.initUndistortRectifyMap(
        scaled_K, distortion_coeffs, np.eye(3), new_K, dim3, cv2.CV_16SC2)
    undistorted_image = cv2.remap(
        image, map1, map2, interpolation=cv2.INTER_LINEAR, borderMode=cv2.BORDER_CONSTANT)
    return undistorted_image, new_K
```
```python
def get_distortion_and_intrinsics(_raw_camera):
    intrinsics = np.array(
        [
            [_raw_camera['intrinsics_0'], 0, _raw_camera['intrinsics_2']],
            [0, _raw_camera['intrinsics_1'], _raw_camera['intrinsics_3']],
            [0, 0, 1],
        ]
    )
    distortion_coeffs = np.array(
        [
            _raw_camera['intrinsics_4'], _raw_camera['intrinsics_5'],
            _raw_camera['intrinsics_6'], _raw_camera['intrinsics_7'],
        ]
    )
    return distortion_coeffs, intrinsics
```
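To tie these helpers together, a minimal end-to-end sketch might look like this (hypothetical names: `RELEASE_DIR`, `take`, `cam_id`, a loaded `frame`, and stand-in 3D joints; not an official example):

```python
import os
import cv2
import numpy as np
import pandas as pd

# Load the GoPro calibration for one camera, as in the snippet above.
traj_dir = os.path.join(RELEASE_DIR, take["capture"]["root_dir"], "trajectory")
calibs = pd.read_csv(os.path.join(traj_dir, "gopro_calibs.csv"))
row = calibs[calibs.cam_uid == cam_id].iloc[0].to_dict()
D, K = get_distortion_and_intrinsics(row)

# Undistort the frame; new_K is then a plain pinhole matrix for the undistorted image.
undistorted, new_K = undistort_exocam(frame, K, D)

points3d_cam = np.random.rand(17, 3) + np.array([0, 0, 3.0])  # stand-in joints (camera coords)
proj = points3d_cam @ new_K.T
points2d = proj[:, :2] / proj[:, 2:3]  # should land near the released 2D annotations
```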
@miguelmartin75 @suyogduttjain, I used the functions you provided above (undistort_exocam, get_distortion_and_intrinsics, etc.) to get the undistorted image as well as the corresponding intrinsics, and used these to project the provided 3D body pose and compare it with the provided 2D body pose. Although it aligns much better than before, there is still some misalignment. See below, where red points show the ground-truth 2D pose and green points show the projected 3D pose. However, according to this function, the alignment between the provided 2D pose and the projected 3D pose should be exact. What could be wrong? One possibility, as mentioned above, is that the intrinsics used for the annotations differ from the released ones.

As @isarandi mentioned before, it would be great if you could provide a working example of projecting the 3D pose and ensuring alignment with the 2D pose.
Hi, we have created a notebook tutorial showing how to undistort images and overlay annotations on them. Link: https://github.com/facebookresearch/Ego4d/blob/main/notebooks/egoexo/Ego-Exo4D_EgoPose_Tutorial.ipynb

Regarding matching the projected 3D and 2D pose: are you looking to match the projections with the human-annotated ground-truth 2D pose, or are you asking specifically about the automatic ground truth?
Thanks @suyogduttjain for the notebook reference. Yes, I am looking to match the projected 3D pose with the annotated 2D pose. Does this mean that the automatic 2D ground truth will match the projected 3D pose, but the human-annotated 2D ground truth may not match the projection because of inconsistencies and triangulation error?
For the automatic pose ground truth, the 2D points were generated first, and 3D triangulation was then done based on those 2D points using the camera parameters. Over the course of dataset building, the camera parameters changed several times due to improved localization algorithms; we updated the 3D poses whenever that happened, but the 2D points remained the same. Hence the 2D points should not be treated as projections. The same holds for the manual ground truth, except that in that case the 2D points were further corrected by humans during the annotation process. Hope this helps.
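For context, the triangulation step described here can be sketched for two views with OpenCV; a minimal sketch, where `P1`, `P2` are hypothetical 3x4 projection matrices (K_new @ [R | t] in undistorted space) and `pts1`, `pts2` are the matched 2D joints:

```python
import cv2
import numpy as np

# pts1, pts2: (N, 2) matched 2D joints in two undistorted views; P1, P2: (3, 4) projection matrices.
points4d = cv2.triangulatePoints(P1, P2, pts1.T.astype(np.float64), pts2.T.astype(np.float64))
points3d = (points4d[:3] / points4d[3]).T  # (N, 3) triangulated joints
```

This also illustrates why the released 2D points are not projections of the released 3D points: the triangulation inputs (the 2D points) were kept fixed while the camera parameters evolved.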
@suyogduttjain, thanks for providing these details. I understand that I should not expect the 2D pose ground truth (automatic or manual) to perfectly match the 3D pose ground truth. This really helps!
Thanks for your great work and effort! The notebook is very helpful for getting correctly aligned keypoints in undistorted image space. But I got stuck when trying to do the inverse, i.e. taking the annotated 2D joints (undistorted image space) and mapping them back to the original image (distorted image space). Inverting the maps together with remap works perfectly fine for an image (the input values to initInverseRectificationMap() are taken from undistort_exocam() in the sample notebook), but I could not get the same approach to work for the annotated 2D points.

Did somebody experience the same problem? I would appreciate your help.
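For mapping individual points (rather than a full image) back to the distorted image, one option is cv2.fisheye.distortPoints, which expects normalized coordinates as input. A minimal sketch, assuming `new_K` and `scaled_K` are the matrices from undistort_exocam and `D` the fisheye coefficients:

```python
import cv2
import numpy as np

def undistorted_to_distorted(points2d, new_K, scaled_K, D):
    # points2d: (N, 2) pixel coords in the undistorted image (under new_K).
    pts = np.asarray(points2d, dtype=np.float64)
    # Undo new_K: pixel coords -> normalized image coords
    normalized = (pts - new_K[[0, 1], [2, 2]]) / new_K[[0, 1], [0, 1]]
    # Re-apply the fisheye distortion model and the original (scaled) K
    distorted = cv2.fisheye.distortPoints(normalized.reshape(-1, 1, 2), scaled_K, D)
    return distorted.reshape(-1, 2)
```

Note that the remap-style maps give, for each output pixel, the location to sample in the input image, so indexing them at a point's coordinates goes the wrong way for transforming point coordinates; hence the point-wise route above.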
It seems that the human poses are not correct.
I'm simply plotting the 2D annotations as released.
This is take uid eba56e4f-7ec8-4d47-9380-e69928323e94 (iiith_cooking_111_2)
I've tried 2-3 other takes, too, and none seem correct.
Here is unc_basketball_03-31-23_02_14 (cef2f19f-ec48-410c-8205-4572cc8706d9), frame 39.
This visualization is directly of the 2D coordinates; I'm not projecting the 3D coordinates to the image myself here (although that also results in misaligned projections).
Is this known?
Furthermore, when I estimate 2D poses on my own, per camera, and try to triangulate them myself using the given camera calibration parameters, it doesn't really work (the reprojection error is high), which leads me to suspect that the camera calibration may have issues.
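One way to quantify this is a per-joint reprojection error in the undistorted image space; a minimal sketch, assuming world-to-camera extrinsics `R`, `t` and the pinhole `new_K` of the undistorted view:

```python
import numpy as np

def reprojection_error(points3d_world, points2d, R, t, new_K):
    # World -> camera coordinates, then pinhole projection with new_K.
    cam = points3d_world @ R.T + t
    proj = cam @ new_K.T
    proj2d = proj[:, :2] / proj[:, 2:3]
    return np.linalg.norm(proj2d - points2d, axis=-1)  # per-joint error in pixels
```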