v1.1.1

@milakov released this 11 Jan 16:03
· 343 commits to master since this release
  • Space-filling curve is now used for all convolutional updaters, testers, and hessians in the CUDA backend; performance of training large networks improved
  • Improved overlap of training with loading and processing of input data at all stages by loading data in a separate host thread (CUDA backend only)
  • In-memory supervised data reader added
  • Added NVTX profiling for reading input data, CUDA backend only
  • Fixed:
    • Binding a texture to a linear buffer that is too large
    • Incorrect average subsampling backprop in the CUDA backend for non-even configs
    • Performance on Windows with the WDDM driver
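
For readers unfamiliar with the separate-host-thread loading mentioned above, the general pattern can be sketched as a producer/consumer pair around a small bounded queue. This is a minimal illustration in standard C++ only, not the library's actual implementation: `BoundedQueue`, `load_batch`, `train_on_batch`, and `run_epoch` are hypothetical names, and the "training" step just sums the batch so the example stays free of CUDA dependencies.

```cpp
#include <condition_variable>
#include <cstddef>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical bounded queue that hands batches from the loader thread
// to the training loop. Not part of the released library.
template <typename T>
class BoundedQueue {
public:
    explicit BoundedQueue(std::size_t capacity) : capacity_(capacity) {}

    // Blocks while the queue is full, then enqueues the item.
    void push(T item) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_full_.wait(lock, [this] { return items_.size() < capacity_; });
        items_.push(std::move(item));
        not_empty_.notify_one();
    }

    // Signals that no more items will be pushed.
    void close() {
        std::lock_guard<std::mutex> lock(mutex_);
        closed_ = true;
        not_empty_.notify_all();
    }

    // Blocks until an item is available. Returns false once the queue
    // has been closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(mutex_);
        not_empty_.wait(lock, [this] { return !items_.empty() || closed_; });
        if (items_.empty()) return false;
        out = std::move(items_.front());
        items_.pop();
        not_full_.notify_one();
        return true;
    }

private:
    std::size_t capacity_;
    bool closed_ = false;
    std::queue<T> items_;
    std::mutex mutex_;
    std::condition_variable not_full_;
    std::condition_variable not_empty_;
};

using Batch = std::vector<float>;

// Stand-in for reading and preprocessing one batch of input data.
Batch load_batch(int index, std::size_t batch_size) {
    return Batch(batch_size, static_cast<float>(index));
}

// Stand-in for a training step; here it only sums the batch values.
double train_on_batch(const Batch& batch) {
    double sum = 0.0;
    for (float v : batch) sum += v;
    return sum;
}

// Loads batches in a separate host thread while the calling thread
// consumes them, so loading and "training" overlap.
double run_epoch(int batch_count, std::size_t batch_size) {
    BoundedQueue<Batch> queue(2);  // capacity 2: one batch in use, one prefetched
    std::thread loader([&] {
        for (int i = 0; i < batch_count; ++i) {
            queue.push(load_batch(i, batch_size));
        }
        queue.close();
    });

    double total = 0.0;
    Batch batch;
    while (queue.pop(batch)) {
        total += train_on_batch(batch);
    }
    loader.join();
    return total;
}
```

The queue capacity bounds how far the loader may run ahead of the consumer, which keeps host memory use predictable; a capacity of 2 is an arbitrary choice for this sketch.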