Releases · milakov/nnForge
v1.1.10
- You can now specify zero-padding of the input data for convolutional layers (see the sketch after this list)
- Memory usage calculations improved
- Learning rates are now per part (previously per parameter): training consumes less memory, so larger networks can be trained
- Dropout implementation simplified
- Minor fixes
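
For reference, the usual relationship between input size, kernel size, and zero-padding for a stride-1 convolution; the helper below is a generic sketch, not nnForge API:

```cpp
// Output size along one dimension of a stride-1 convolution
// (generic formula; the helper name is illustrative, not part of nnForge).
unsigned int convolution_output_size(
	unsigned int input_size,
	unsigned int kernel_size,
	unsigned int zero_padding) // padding added on each side
{
	// Zero-padding enlarges the effective input before the kernel is applied.
	return input_size + 2 * zero_padding - kernel_size + 1;
}

// Example: a 32-wide input with a 5-wide kernel and padding 2 keeps the output width at 32.
```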
v1.1.9
- More sparse cases supported in the GPU backend, with improved performance
- convert_data_type_transformer added
- Hessian-based learning algorithm removed
- Galaxy Zoo example removed; use previous releases to get it
- Reporting average weights/updates after each batch
- Image classifier demo added; improved performance when running a single entry through the tester
v1.1.8
- Sparse (in the feature map dimension) convolutional layer added, with full support in the CPU backend and fully connected (spatial 1x1) support in the GPU backend
- You can now use -std=c++11 with CUDA 6.5
- Gradient check added (see the sketch after this list)
- GTSRB switched to batch training
- Default paths for the Boost and OpenCV libraries are now /usr
- Improved performance for 1x1 convolutions in the GPU backend
- Minor fixes
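
Gradient checking typically compares analytical gradients against central finite differences; a minimal generic sketch of that comparison (the error-function callback is hypothetical, not the nnForge interface):

```cpp
#include <cmath>
#include <functional>
#include <vector>

// Compare an analytical gradient against a central-difference estimate.
// f evaluates the error for a given weight vector; grad is the analytical gradient.
bool check_gradient(
	const std::function<double(const std::vector<double>&)>& f,
	std::vector<double> weights,
	const std::vector<double>& grad,
	double epsilon = 1.0e-5,
	double tolerance = 1.0e-4)
{
	for (size_t i = 0; i < weights.size(); ++i)
	{
		double original = weights[i];
		weights[i] = original + epsilon;
		double f_plus = f(weights);
		weights[i] = original - epsilon;
		double f_minus = f(weights);
		weights[i] = original; // restore the weight

		double numerical = (f_plus - f_minus) / (2.0 * epsilon);
		double scale = std::fabs(numerical) + std::fabs(grad[i]) + 1.0e-8;
		if (std::fabs(numerical - grad[i]) > tolerance * scale)
			return false; // analytical and numerical gradients disagree
	}
	return true;
}
```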
v1.1.7
- Mini-batches added
- Weight decay added
- Momentum added (a sketch of the resulting mini-batch update appears after this list)
- Cross Entropy error function renamed to Negative Log Likelihood; true Cross Entropy added
- Sigmoid layer added, with correct bias initialization for the classifier
- A single epoch can be split into multiple epochs via the epoch_count_in_training_set parameter
- max_subsampling layer supports 1D and 4D in the GPU backend (previously 2D and 3D only)
- rotate_band_data_transformer extended to all dimensions (previously 2D only)
- extract_data_transformer extended to data of any dimension when input and output windows match
- snapshot_data: added scaling and 3D (video)
- Sigmoid+Cross-entropy and Softmax+Negative-log-likelihood fusion implemented in the CPU and GPU backends to improve accuracy
- Max L2 bound on incoming weights implementation dropped
- Conversion to black-and-white images fixed in the GTSRB example
- max subsampling updater and hessian: corner cases fixed in the CPU backend
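
For reference, a minimal sketch of a mini-batch SGD weight update with momentum and weight decay in the common textbook form; the function and parameter names are illustrative and may not match nnForge internals:

```cpp
#include <vector>

// One mini-batch update: v = momentum * v - lr * (grad + decay * w); w += v.
// Gradients are assumed to be already averaged over the mini-batch.
void sgd_update(
	std::vector<float>& weights,
	std::vector<float>& velocity,        // same size as weights, initially zero
	const std::vector<float>& gradients, // d(error)/d(weight), averaged over the batch
	float learning_rate,
	float momentum,
	float weight_decay)
{
	for (size_t i = 0; i < weights.size(); ++i)
	{
		velocity[i] = momentum * velocity[i]
			- learning_rate * (gradients[i] + weight_decay * weights[i]);
		weights[i] += velocity[i];
	}
}
```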
v1.1.6
- Stochastic Gradient Descent training method added
- Resume training functionality added
- Duplicating output to log file
- Logging current settings at the toolset initialization
- rgb_to_yuv_convert_layer_tester added in CPU backend
- Readers are redesigned to allow variable data readers
- classifier_result is extended to top-N (see the sketch after this list)
- Added the possibility to split a single reader into multiple epochs
- Multiple fixes
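
Top-N classification counts a prediction as correct when the true class is among the N highest-scoring outputs; a generic sketch of that check (not the classifier_result interface itself):

```cpp
#include <vector>

// Returns true if true_class is among the top_n highest scores.
bool is_in_top_n(
	const std::vector<float>& scores,
	unsigned int true_class,
	unsigned int top_n)
{
	float true_score = scores[true_class];
	unsigned int better_count = 0;
	for (size_t i = 0; i < scores.size(); ++i)
		if (scores[i] > true_score)
			++better_count; // count classes scoring strictly higher than the true class
	return better_count < top_n;
}
```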
v1.1.5
- Performance of weight updaters for convolutional layers improved in the CUDA backend
- Convolutional 1D, 2D, 3D, and 4D layers are fully supported by CUDA backend
- Fixed training of multiple networks with the CPU backend
- Fixed supervised_data_mem_reader for float data
v1.1.4
- Limited C++11 support added: you can build everything except the CUDA backend, since NVCC does not yet support C++11
- Improved testing and validation (feed-forward) performance of convolutional layers in the CUDA backend for Kepler, while greatly simplifying the code. For the Galaxy Zoo example, the performance of the convolutional layers alone increased from 1.15 TFLOPS to 2 TFLOPS on GeForce GTX Titan, which raised whole-network performance from 1 TFLOPS to 1.55 TFLOPS during validation and testing
- Improved performance of the max subsampling 2D tester in the CUDA backend; the implementation is still far from optimal
v1.1.3
- Snapshot functionality fully redesigned: it now performs backpropagation; the feature is still in beta
- Ability to define custom error functions added
- Cross-entropy error function added; use with care, as it is not tested yet (see the sketch after this list)
- Galaxy Zoo example added
- cuda_max_global_memory_usage_ratio is set to 0.8 by default; this should help those running the code on a primary video card
- per_layer_mu mode added, giving more robust training in some cases
- Fixes:
- Fixed crash when using output transformer
- Fixed backprop for local_contrast_subtractive_2d_updater in CUDA backend
- Fixed build with Boost 1.55
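
For reference, the standard cross-entropy error for sigmoid outputs; this is a generic sketch, not the nnForge implementation:

```cpp
#include <algorithm>
#include <cmath>
#include <vector>

// Cross-entropy error for sigmoid outputs y in (0,1) against targets t in {0,1}:
// E = -sum_i ( t_i * log(y_i) + (1 - t_i) * log(1 - y_i) )
double cross_entropy_error(
	const std::vector<double>& predicted, // sigmoid outputs
	const std::vector<double>& target)
{
	double error = 0.0;
	for (size_t i = 0; i < predicted.size(); ++i)
	{
		// Clamp predictions to avoid log(0).
		double y = std::min(std::max(predicted[i], 1.0e-12), 1.0 - 1.0e-12);
		error -= target[i] * std::log(y) + (1.0 - target[i]) * std::log(1.0 - y);
	}
	return error;
}
```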
v1.1.2
- Deterministic transformer added for testing and validation
- Snapshots are made on ANNs from the batch directory
- Toolset parameters changed:
- learning_rate_decay_rate is exposed as a command line parameter (an assumed decay schedule is sketched after this list)
- training_speed parameter renamed to learning_rate; training_speed_degradation dropped
- training_iteration_count renamed to training_epoch_count
- train command now does batch training; the batch_train command is removed
- validate and test now work in batch mode; validate_batch and test_batch removed
- mu_increase_factor is set to 1.0 by default
- max_mu set to 1.0 by default
- Bug fixes
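
One common reading of a learning-rate decay rate is a multiplicative decay applied per epoch; the exact schedule used by the toolset is not documented here, so treat this as an assumed form:

```cpp
#include <cmath>

// Assumed exponential schedule: lr(epoch) = initial_lr * pow(decay_rate, epoch).
// Whether nnForge applies exactly this form is not confirmed here.
float decayed_learning_rate(
	float initial_learning_rate,
	float learning_rate_decay_rate,
	unsigned int epoch)
{
	return initial_learning_rate * std::pow(learning_rate_decay_rate, static_cast<float>(epoch));
}
```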
v1.1.1
- Space-filling curve used for all convolutional updaters, testers, and hessians in the CUDA backend; training performance for large networks improved
- Improved overlap of training with loading/processing of input data for all stages by loading data in a separate host thread (CUDA backend only)
- In-memory supervised data reader added
- Added NVTX profiling for reading input data (CUDA backend only); a minimal usage sketch appears after this list
- Fixed:
- Binding a texture to an overly large linear buffer
- Average subsampling backprop in the CUDA backend was wrong for non-even configurations
- Performance on Windows with the WDDM driver
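
NVTX ranges mark host-side work so it shows up on the profiler timeline; a minimal sketch of wrapping a data-reading step (the body of the function is a placeholder, not nnForge code):

```cpp
#include <nvToolsExt.h>

void read_input_data_profiled()
{
	nvtxRangePushA("read input data"); // named range appears in nvprof / Nsight timelines
	// ... read and preprocess one chunk of input data here (placeholder) ...
	nvtxRangePop();
}
```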