-
Notifications
You must be signed in to change notification settings - Fork 46
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support for NFFTs of real valued Arrays #86
Comments
Hello @roflmaostc, it wasn't a priority for me since MRI data is always complex. Today I'm unable to implement it due to time constraints, but I could possibly consider a PR if someone else did it. |
I can try to help since I would love that feature! Maybe specified with another parameter? What is requried? Would that be here?
There is probably more steps to do with the interpolation? |
I think the interpolator could be a little bit involved. First you'd want to check that it has no imaginary values (I don't remember). Then there might be a question of reflecting the results to the Hermitian-symmetric part of Fourier space so that the k-nearest neighbors algorithm works correctly. After doing these things I'm actually guessing there would be no speedup, because the interpolator is the bottleneck anyway. Maybe a deeper look into the NUFFT algorithm to consider real inputs would be necessary to get an efficient implementation. I don't think it's as simple as changing the forward FFT op. |
What is the estimated overhead in your case? I observed the following. Does that match with your experiences? So NUFFT along the last two dims, and a leading batch dim. voxels = torch.zeros(100,1, 768,768, device="cuda")+ 0j
# 300 angles
nufft, adjnufft, ktraj = prepare_nufft(voxels, 300, numpoints=3, gridfactor=1.25, device="cuda")
%%timeit
torch.fft.fft2(voxels +0j)
torch.cuda.synchronize()
# 9ms
%%timeit
nufft(voxels + 0j, ktraj)
torch.cuda.synchronize()
# 61ms In that case, a RFFT really would only save ~4 to 5ms. (maybe a little more because of 1.25 padding) |
85% overhead seems a little high but could be possible for some trajectories. It's actually trajectory-dependent. If you only have one radial spoke then it's possible the FFT takes longer. |
I actually used 300 radial spokes of length 768 each. And a batch size of 100 maybe, in max. What should be the shape of ktraj in this case? Tbh, in the documentation I was confused about coils, etc. Should I rearrange the spokes along another dim for more parallelization? |
Hi,
as far as I can see, there is no support for real-valued input arrays?
However, with
rfft
we could save a factor of 2 on computation for those cases by usingrfft
.Are there some issues with that?
Best,
Felix
The text was updated successfully, but these errors were encountered: