
Training param tuning


Training parameters in prototxt

There are two types of data augmentation methods for different applications:

https://github.com/eric612/MobileNet-YOLO/issues/29

For example, I would choose adaptive aspect ratio for fisheye videos, which have pixel-wise geometric distortion.

BGR to RGB

To convert the channel order in both the training and test phases, add this parameter to the prototxt:

layer {
  name: "data"
  type: "AnnotatedData"
  ...
  transform_param {
    scale: ...
    cvt_bgr2rgb: true  # convert input channel order from BGR to RGB
    ...
  }
  ...
}

Keep aspect ratio

This type may decrease accuracy when the input size is smaller than 416 (compared with "Adaptive aspect ratio")

  • Set the preprocessing resize mode to "FIT_LARGE_SIZE_AND_PAD"
  • Use "FIT_LARGE_SIZE_AND_PAD" resizing at inference as well (see the sketch after this list)
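A minimal sketch of this mode, assuming the SSD-style resize_param used by this codebase; the 416 input size and pad_value are assumptions:

transform_param {
  ...
  resize_param {
    prob: 1.0
    resize_mode: FIT_LARGE_SIZE_AND_PAD  # fit the long side, pad the short side
    height: 416
    width: 416
    pad_value: 127.5  # assumed padding color
    interp_mode: LINEAR
  }
}

Use the same resize_param in the test/deploy prototxt so inference matches training.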

Adaptive aspect ratio

This type may break the k-means anchor assumptions and affect accuracy by about ±1% in my tests

  • Set the preprocessing resize mode to "WARP"
  • Set the expand_param ratio per dataset: {VOC: 4.0, COCO: 1.5, ...} (see the sketch after this list)
  • Use "WARP" resizing at inference as well
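A minimal sketch of both settings, again using the SSD-style transform_param; the prob values are assumptions:

transform_param {
  ...
  resize_param {
    prob: 1.0
    resize_mode: WARP  # stretch to the network input, ignoring aspect ratio
    height: 416
    width: 416
    interp_mode: LINEAR
  }
  expand_param {
    prob: 0.5
    max_expand_ratio: 4.0  # VOC; ~1.5 for COCO, per the list above
  }
}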

Warm up training

If the solver type is set to "SGD", you may need to set the learning rate policy along the lines below.
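A hedged sketch of one way to do this in stock Caffe (which has no dedicated warm-up policy): run a short low-LR stage first, then resume with the full schedule. All values below are placeholder assumptions:

# Stage 1: warm up for ~1000 iterations at 1/10 of the target learning rate
type: "SGD"
base_lr: 0.00005
lr_policy: "fixed"
max_iter: 1000
momentum: 0.9
weight_decay: 0.0005
snapshot: 1000
snapshot_prefix: "models/warmup"

# Stage 2: resume from the warm-up snapshot (--snapshot) with the full schedule, e.g.
# base_lr: 0.0005
# lr_policy: "multistep"
# gamma: 0.5
# stepvalue: 40000
# stepvalue: 60000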

Pre-trained weights and batch size

total_batch_size = iter_size * batch_size
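Here batch_size comes from the train data layer and iter_size from the solver prototxt. For example, an effective batch of 64 can be reached on limited GPU memory like this (values are illustrative):

# Solver prototxt: accumulate gradients over 8 forward passes
iter_size: 8

# Train prototxt, data layer: 8 images per pass -> 8 * 8 = 64 total
layer {
  name: "data"
  type: "AnnotatedData"
  ...
  data_param {
    batch_size: 8
    ...
  }
}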

If the pre-trained weights come from a

  • Classification model (like ImageNet)

    set total_batch_size to at least 64

  • Detection model (like MS-COCO)

    set total_batch_size to at least 16 ("PASCAL-VOC")

    set total_batch_size to at least 32, 64 recommended ("MS-COCO")

Loss does not converge or is unstable

You can try the training tricks below:

  • Decrease/increase expand_param, batch_sampler, or jitter scale (data augmentation)
  • Set a larger batch size
  • Set lr_mult: 0.1 or 0.2 (and decay_mult likewise) on the first 0 ~ x layers (x is 10 for MobileNet), depending on your backbone network; see the sketch after this list
  • Change the solver type
  • Set rms_decay in the range 0.9 ~ 0.98 in the solver prototxt (RMSProp only)
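A minimal sketch of the lr_mult/decay_mult trick on an early backbone layer; the layer name and convolution settings here are hypothetical:

layer {
  name: "conv0"  # hypothetical first backbone layer
  type: "Convolution"
  bottom: "data"
  top: "conv0"
  param {
    lr_mult: 0.1     # learn 10x slower than base_lr
    decay_mult: 0.1  # scale this layer's weight decay down to match
  }
  convolution_param {
    num_output: 32
    kernel_size: 3
    stride: 2
    pad: 1
    bias_term: false
  }
}

For the RMSProp trick, set type: "RMSProp" and, for example, rms_decay: 0.95 in the solver prototxt.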