Compute density compensation for screen space blurring of tiny gaussians #117
I'm fine with merging this PR, but I want to let @vye16 decide whether we need backward compatibility.
- **xys** (Tensor): x,y locations of 2D gaussian projections.
- **depths** (Tensor): z depth of gaussians.
- **radii** (Tensor): radii of 2D gaussian projections.
- **conics** (Tensor): conic parameters for 2D gaussian.
- **compensation** (Tensor): the density compensation for blurring 2D kernel
This extra return value would break backward compatibility. Personally I'm fine with that, since we are still in the actively developed 0.1.x versions, but I'll let @vye16 decide from which version on we want to maintain backward compatibility.
Hi @vye16, could you take a look at this PR and see if you have any comments beyond those from @liruilong940607?
@@ -88,6 +92,7 @@ __global__ void project_gaussians_forward_kernel(
     depths[idx] = p_view.z;
     radii[idx] = (int)radius;
     xys[idx] = center;
+    compensation[idx] = comp;
Reading and writing global memory is usually the most time-consuming part of a kernel (computation is rarely the bottleneck). I tested this briefly: it slows `project_gaussians` from 3000 it/s to 2800 it/s, which is not much, so I think it is fine, especially since `project_gaussians` is much cheaper than the rasterization stage. I'm fine with this small extra cost but want to point it out for future reference.
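For reference, the throughput figures quoted above can be converted into a per-call cost. This is only an arithmetic sketch on the two reported numbers; the helper name is made up for illustration.

```python
def per_iter_ms(its_per_s: float) -> float:
    """Convert a throughput in iterations/second to milliseconds per iteration."""
    return 1000.0 / its_per_s


before = per_iter_ms(3000.0)  # reported throughput without the extra write
after = per_iter_ms(2800.0)   # reported throughput with the compensation write

extra_ms = after - before
throughput_drop = 1.0 - 2800.0 / 3000.0

print(f"{before:.3f} ms -> {after:.3f} ms per call (+{extra_ms:.3f} ms)")
print(f"throughput drop: {throughput_drop:.1%}")
```

That is roughly 0.024 ms of extra time per call, or a throughput drop of just under 7%.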
The code I used to test this.
import torch


def profiling(N: int = 1000000, D: int = 3):
    import tqdm
    from gsplat import project_gaussians, rasterize_gaussians

    torch.manual_seed(42)
    device = torch.device("cuda:0")

    # Random gaussians: positions, scales, and normalized quaternions.
    means3d = torch.rand((N, 3), device=device, requires_grad=False)
    scales = torch.rand((N, 3), device=device) * 5
    quats = torch.randn((N, 4), device=device)
    quats /= torch.linalg.norm(quats, dim=-1, keepdim=True)

    viewmat = projmat = torch.eye(4, device=device)
    fx = fy = 3.0
    H, W = 256, 256
    BLOCK_X = BLOCK_Y = 16
    # Number of tiles along x and y (rounded up), plus a trailing 1.
    tile_bounds = (W + BLOCK_X - 1) // BLOCK_X, (H + BLOCK_Y - 1) // BLOCK_Y, 1

    # tqdm reports the it/s of the projection step.
    pbar = tqdm.trange(10000)
    for _ in pbar:
        xys, depths, radii, conics, num_tiles_hit, cov3d = project_gaussians(
            means3d,
            scales,
            1,
            quats,
            viewmat,
            projmat,
            fx,
            fy,
            W / 2,
            H / 2,
            H,
            W,
            tile_bounds,
        )
Thanks for the evaluation effort.
Needs to run the formatter.
I had used a newer version of black. I have now switched to the same version of black that devops uses and re-run the formatter.
Initial discovery of the issue:
Example showing the effect of the fix:
Looks good to me, aside from this backward compatibility issue. How much faster is it to do this in the kernel than to compute it from the 2D covariance returned by the original projection function? If it's much faster, I'm happy to break backward compatibility.
Why
Current Gaussian splatting implementations (both Inria and nerfstudio) do not adequately handle rendering tiny gaussians at a resolution substantially different from the captured one. The cause is the 0.3-pixel screen-space blurring kernel applied to the tiny splats.
A tiny splat that has been enlarged by 0.3 pixel can block more of the gaussians behind it when rendered at a lower resolution than the captured one, or look much thinner than it should when rendered at a higher resolution. These artifacts are easy to observe by changing the rendering distance or resolution.
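To make the resolution dependence concrete, here is a small sketch. It assumes the blur is applied as the formula in the Solution section suggests, i.e. 0.3 is added to the diagonal of the 2D covariance (in squared pixels). The function name and the sample splat size are illustrative only.

```python
import math

BLUR_VAR = 0.3  # assumed: value added to the 2D covariance diagonal, in px^2


def blurred_std(std_px: float) -> float:
    """Std-dev (in pixels) of an isotropic splat after the screen-space blur."""
    return math.sqrt(std_px**2 + BLUR_VAR)


# The same splat seen at capture resolution, half resolution, quarter resolution.
for scale in (1.0, 0.5, 0.25):
    std = 0.4 * scale  # hypothetical splat: 0.4 px std-dev at capture resolution
    growth = blurred_std(std) / std
    print(f"scale {scale:>4}: std {std:.2f}px -> {blurred_std(std):.2f}px (x{growth:.2f})")
```

The relative enlargement grows as the projected splat shrinks, which is why the artifact becomes more visible at lower rendering resolutions.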
This issue is acknowledged by the original author of Gaussian splatting, and was addressed in another open-source repo.
Addressing this issue brings a significant benefit in an interactive web viewer, where one zooms in and out to inspect a splat model and sees far fewer artifacts than before. Another simple quantitative experiment is to train at 1/2 resolution and evaluate at the original resolution; addressing the issue is expected to improve the metrics.
Solution
The solution is rather simple. We compute a compensation factor $\rho=\sqrt{\dfrac{\det(\Sigma)}{\det(\Sigma+ 0.3 I)}}$ for each splat and multiply the gaussian's opacity by it before rasterization. The same solution was also proposed in another research paper.
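As a plain-Python illustration of the formula above (not the CUDA code in this PR), the factor for a single 2D covariance $\Sigma = \begin{pmatrix} a & b \\ b & c \end{pmatrix}$ could be computed as follows. The function name and the clamp are assumptions made for this sketch.

```python
import math


def compensation(a: float, b: float, c: float, blur: float = 0.3) -> float:
    """rho = sqrt(det(S) / det(S + blur * I)) for S = [[a, b], [b, c]]."""
    det = a * c - b * b
    det_blur = (a + blur) * (c + blur) - b * b
    # Clamp at zero to guard against tiny negative determinants from round-off.
    return math.sqrt(max(det / det_blur, 0.0))


print(compensation(100.0, 0.0, 100.0))  # large splat: close to 1
print(compensation(0.1, 0.0, 0.1))      # tiny splat: strongly attenuated
```

For a large splat the factor stays near 1, so its opacity is almost unchanged; for a tiny splat the factor drops well below 1, which offsets the extra area added by the blur.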
This PR modifies the output of project_gaussians to return the compensation factor $\rho$. The change allows two modes in nerfstudio's splatfacto: the classic mode, which mimics the behavior of the official GS implementation, and the new mode, which addresses the "aliasing-like" issue.
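The difference between the two modes comes down to whether the opacity is scaled by $\rho$ before rasterization. A minimal sketch, with a hypothetical helper and flag name that are not part of splatfacto or gsplat:

```python
def effective_opacity(opacity: float, rho: float, antialiased: bool) -> float:
    """Opacity handed to the rasterizer under the two modes described above."""
    # Classic mode ignores the compensation; the new mode scales opacity by it.
    return opacity * rho if antialiased else opacity


print(effective_opacity(0.8, 0.25, antialiased=False))  # classic: unchanged
print(effective_opacity(0.8, 0.25, antialiased=True))   # new mode: attenuated
```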
See results here (I had to upload the video to my own forked repo's issue due to file size limit).