The inference time is slower than that reported in the paper #9
I also tested the synchronized time on the 2080Ti:

```python
input_img = input_img.to(device)
torch.cuda.synchronize()               # make sure previously queued GPU work is finished
tm = time.time()
pred = model(input_img)[2]
torch.cuda.synchronize()               # wait for the forward pass to actually complete
elapsed = time.time() - tm
adder(elapsed)
pred_clip = torch.clamp(pred, 0, 1)
```

For MIMO-UNet:
And for MIMO-UNet+
This result is consistent with the performance gap between the 2080Ti and the 3090, but I am still confused about the reported performance on the Titan XP.
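As a cross-check, the same measurement can also be made with CUDA events instead of `time.time()`. This is only a minimal sketch that reuses `model`, `input_img`, and `adder` from the snippet above; it is not part of the MIMO-UNet test script.

```python
# Sketch: per-image timing with CUDA events (elapsed_time returns milliseconds).
start_evt = torch.cuda.Event(enable_timing=True)
end_evt = torch.cuda.Event(enable_timing=True)

start_evt.record()
pred = model(input_img)[2]
end_evt.record()
torch.cuda.synchronize()               # ensure end_evt has been recorded before reading it

adder(start_evt.elapsed_time(end_evt) / 1000.0)   # convert ms -> s, same unit as before
pred_clip = torch.clamp(pred, 0, 1)
```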
I also tested MT-RNN and MPRNet on the same 2080Ti PC. For MT-RNN, the asynchronous and synchronized inference times are 46 ms and 480 ms, respectively, while the time reported in the MT-RNN paper is 0.07 s on a Titan V. What confuses me is that I measured longer asynchronous times (15 ms / 30 ms on the 2080Ti) than the results (8 ms / 17 ms on the Titan XP) reported in your paper.
Thank you for your interest in our work. The inference time reported in the manuscript was measured in the following HW/SW environment, and the log file for this experiment can be found at the following link. Hardware: TITAN XP (GPU), Intel i5-8400 (CPU). Please note that, depending on the version of PyTorch or CUDA, the change in inference time may differ for each network, as discussed in the CUDA and PyTorch threads. Best,
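Since the numbers depend so strongly on the PyTorch/CUDA versions, it may help to log the environment together with the timings. A minimal sketch (not from the repository) could look like this:

```python
import platform
import torch

# Record the SW/HW environment so inference times can be compared fairly.
print("PyTorch:", torch.__version__)
print("CUDA   :", torch.version.cuda)
print("cuDNN  :", torch.backends.cudnn.version())
print("GPU    :", torch.cuda.get_device_name(0) if torch.cuda.is_available() else "none")
print("CPU    :", platform.processor() or platform.machine())
```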
I can't view the log directly. This question is not just about your paper but about the whole image-deblurring community: what is your opinion on which time should be reported in an academic paper? I think the unsynchronized times reported by existing methods will cause misunderstanding, especially times below 30 ms, which appear to meet real-time requirements. In fact, as the experiments above show, these models run at less than 5 FPS and cannot be applied in practical applications with real-time requirements. I will also raise this issue with others.
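To make the FPS argument concrete, here is a minimal benchmarking sketch with warm-up iterations and an averaged synchronized latency; `benchmark`, `n_warmup`, and `n_runs` are hypothetical names and not part of any of the repositories discussed here.

```python
import time
import torch

@torch.no_grad()
def benchmark(model, input_img, n_warmup=10, n_runs=100):
    """Return the average synchronized latency (s) and the resulting FPS."""
    model.eval()
    # Warm-up: the first iterations include cuDNN autotuning and allocator setup.
    for _ in range(n_warmup):
        model(input_img)
    torch.cuda.synchronize()

    start = time.time()
    for _ in range(n_runs):
        model(input_img)
    torch.cuda.synchronize()           # wait until every queued forward pass is done
    latency = (time.time() - start) / n_runs
    return latency, 1.0 / latency
```

With this measurement, any model whose synchronized latency exceeds 200 ms lands below 5 FPS, which is the point made above.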
I have tested MIMO-UNet and MIMO-UNet+ on a single 2080Ti card (whose theoretical performance is higher than the Titan XP's), which take about 15 ms and 30 ms, respectively. I didn't make any changes to the open-source code; I just ran the test command (https://github.com/chosj95/MIMO-UNet#test) directly.
For MIMO-UNet:
And for MIMO-UNet+
In addition, I think the CUDA-synchronized time should be used when reporting timing performance. The unsynchronized time cannot correctly measure the speed and complexity of the model, because it only captures how long it takes to enqueue the GPU kernels, not how long they take to run.
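The gap between the two measurements is easy to reproduce on any GPU workload. The following self-contained sketch (using a plain matrix multiplication rather than MIMO-UNet) shows why the unsynchronized number looks so much smaller:

```python
import time
import torch

x = torch.randn(4096, 4096, device="cuda")
_ = x @ x                              # warm-up (cuBLAS init, memory allocation)
torch.cuda.synchronize()

# Unsynchronized: only measures how long it takes to enqueue the kernel.
t0 = time.time()
y = x @ x
t_async = (time.time() - t0) * 1000

# Synchronized: waits until the GPU has actually finished the computation.
torch.cuda.synchronize()
t0 = time.time()
y = x @ x
torch.cuda.synchronize()
t_sync = (time.time() - t0) * 1000

print(f"unsynchronized: {t_async:.2f} ms   synchronized: {t_sync:.2f} ms")
```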