An EfficientDet implementation in TF2.0, based on the CVPR 2020 paper EfficientDet: Scalable and Efficient Object Detection.
Our implementation is under the \ours directory.
How to use Keras fit and fit_generator (a hands-on tutorial)
How to Train Your Keras Model (Dramatically Faster)
Distributed training with TensorFlow
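The references above cover Keras fit/fit_generator and distributed training. As a repo-independent sketch, multi-GPU Keras training in TF 2.x builds the model inside a tf.distribute.MirroredStrategy scope and then calls fit as usual; the toy model below is purely illustrative and is not this repo's EfficientDet:

```python
import tensorflow as tf

# A minimal sketch of multi-GPU Keras training with tf.distribute (TF 2.x).
# The toy model below is illustrative only; it is not this repo's EfficientDet.
strategy = tf.distribute.MirroredStrategy()
print("Replicas in sync:", strategy.num_replicas_in_sync)

with strategy.scope():
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(64, activation="relu", input_shape=(128,)),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )

# Once the model is created inside the strategy scope, model.fit() is used as usual,
# e.g. model.fit(train_dataset, epochs=100, steps_per_epoch=150)
```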
- --weighted-bifpn: use the weighted BiFPN
- --snapshot: specify a checkpoint or pretrained model; pass imagenet to use an ImageNet-pretrained backbone
- --phi: select the model scale, from 0 (EfficientDet D0) to 6 (EfficientDet D6)
- --freeze-backbone: freeze the backbone weights
- --steps: number of steps per epoch
- --epochs: number of training epochs
- pascal: use the Pascal VOC dataset format; pass the path to the VOC2007 directory
For example, to train EfficientDet D1 with an ImageNet-pretrained backbone and the weighted BiFPN:
python3 train.py --snapshot imagenet --phi 1 --weighted-bifpn --gpu 3 --random-transform --compute-val-loss --freeze-backbone --batch-size 32 --steps 150 --epochs 100 pascal /opt/shared-disk2/sychou/comp2/VOCdevkit/VOC2007/
Because of the import paths (import utils), the testing file eval/common.py must be moved up to the parent directory before it can be run. Also pay attention to the following parameters inside the file:
- phi: selects the EfficientDet backbone; it must match the --phi value passed to python3 train.py during training
- weighted_bifpn: whether the weighted BiFPN is used
- PascalVocGenerator: the path of the testing dataset, e.g. '/opt/shared-disk2/sychou/comp2/VOCdevkit/test/VOC2007'
- model_path: the trained model weights, e.g. '/home/ccchen/sychou/comp2/comp2/efficent_series/EfficientDet/old_checkpoint/pascal_45_0.3772_0.3917.h5'
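As a rough sketch, the edited values in eval/common.py end up looking like this (variable names mirror the parameters listed above; the commented-out generator call only hints at the dataset path and is not the file's exact code):

```python
# Sketch of the values to edit inside eval/common.py after moving it up one level.
# Variable names mirror the parameters listed above; the commented generator call is
# an assumption about its argument layout -- keep the call already present in the file.
phi = 1                   # must match the --phi value used with train.py
weighted_bifpn = True     # whether the weighted BiFPN was used during training
model_path = '/home/ccchen/sychou/comp2/comp2/efficent_series/EfficientDet/old_checkpoint/pascal_45_0.3772_0.3917.h5'
# generator = PascalVocGenerator('/opt/shared-disk2/sychou/comp2/VOCdevkit/test/VOC2007', 'test')
```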
BUJO+ Environment:
- GPU: 2080Ti 10986MB
- Python 3.6.8 :: Anaconda, Inc.
- TensorFlow 2.2.0 with GPU support
- CUDA compilation tools, release 10.0, V10.0.130
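Before launching a long run, the environment above (TensorFlow 2.2 with a visible GPU) can be confirmed from Python with standard TensorFlow calls, nothing repo-specific:

```python
import tensorflow as tf

# Quick environment sanity check before starting a long training run.
print("TensorFlow version:", tf.__version__)        # expect 2.2.0 in this setup
gpus = tf.config.list_physical_devices('GPU')
print("Visible GPUs:", gpus)

# Optional: allocate GPU memory on demand instead of all at once, which makes it
# easier to see how close a given phi/batch size is to the 11 GB card limit.
for gpu in gpus:
    tf.config.experimental.set_memory_growth(gpu, True)
```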
Here is the trained checkpoint.
- Change model_path in efficientdet.py to your own model, and set phi to the matching value
- predict.py has two sections: the upper one runs detection on images, the lower one generates test_prediction.txt
- Run predict.py to generate test_prediction.txt
- Make sure txt2csv.py and test_prediction.txt are in the same directory, then run txt2csv.py to produce the final output.csv
Common issues:
- For general prediction use line 51 of predict.py; for comp2 use line 52 of predict.py
- Use txt2csv.py for the file conversion, otherwise all kinds of hard-to-predict environment problems tend to occur (do not generate the answers directly with evaluate.py)
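For reference, the conversion performed by txt2csv.py amounts to something like the sketch below; the field layout of test_prediction.txt is an assumption here, so keep using the provided txt2csv.py for the real submission:

```python
import csv

# Hypothetical sketch of the conversion done by txt2csv.py; the field order of
# test_prediction.txt is an assumption here, so use the provided txt2csv.py for
# the real submission file.
with open('test_prediction.txt') as src, open('output.csv', 'w', newline='') as dst:
    writer = csv.writer(dst)
    for line in src:
        if line.strip():
            writer.writerow(line.split())
```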
- command:
python3 train.py --snapshot imagenet --phi 1 --weighted-bifpn --gpu 3 --random-transform --compute-val-loss --freeze-backbone --batch-size 32 --steps 150 --epochs 100 pascal /opt/shared-disk2/sychou/comp2/VOCdevkit/VOC2007/
- Backbone: EfficientNet B1, phi 1
The final result (epoch 100) of the first training run: training mAP 0.97, testing mAP 0.74.
The result at epoch 99 of that run was also mAP 0.97. However, I forgot to split train and test into different folders, so the model probably trained on the test dataset and likely overfits.
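One way to avoid this is to split the VOC image IDs into disjoint trainval/test lists before training. The sketch below is only illustrative: the 90/10 ratio and writing ImageSets/Main files are assumptions, and the later runs instead keep separate VOCdevkit/trainval and VOCdevkit/test trees.

```python
import os
import random

# Illustrative only: split the VOC image IDs into disjoint trainval/test lists so
# the model never trains on test images. The 90/10 ratio and the choice to write
# ImageSets/Main files are assumptions, not what was actually done for these runs.
voc_root = '/opt/shared-disk2/sychou/comp2/VOCdevkit/VOC2007'
ids = sorted(f[:-4] for f in os.listdir(os.path.join(voc_root, 'Annotations'))
             if f.endswith('.xml'))
random.seed(0)
random.shuffle(ids)

split = int(0.9 * len(ids))
sets_dir = os.path.join(voc_root, 'ImageSets', 'Main')
with open(os.path.join(sets_dir, 'trainval.txt'), 'w') as f:
    f.write('\n'.join(ids[:split]) + '\n')
with open(os.path.join(sets_dir, 'test.txt'), 'w') as f:
    f.write('\n'.join(ids[split:]) + '\n')
```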
- Phi 6: ran out of memory with batch size 8
- Phi 5: ran out of memory with batch size 32
- Phi 3: ran out of memory with batch size 16, started training successfully with batch size 8
- command:
python3 train.py --snapshot imagenet --phi 3 --weighted-bifpn --gpu 3 --random-transform --compute-val-loss --freeze-backbone --batch-size 8 --steps 150 --epochs 100 pascal /opt/shared-disk2/sychou/comp2/VOCdevkit/trainval/VOC2007/
- Backbone: EfficientNet B3, phi 3
Epoch 45: mAP 0.73, trained for 7.5 hours
Epoch 100: mAP 0.9399, trained for 15 hours
Testing mAP: 0.7704
- phi 2: only trained for 44 epochs
Epoch 45, batch size 16, trained for 9 hours, Kaggle score 0.79725
- phi 1: trained 100 epochs (50 epochs of training, 50 epochs of fine-tuning); a command-level sketch of this freeze-then-fine-tune schedule is given at the end of this section
50 epochs of training with batch size 32
50 epochs of fine-tuning with batch size 4
Trained for 16 hours
Trained 85 epochs (no fine-tuning), batch size 8
Trained for 12 hours
Trained 176 epochs (epochs 0~50 training, 50~100 fine-tuning, 100~176 training), batch size 24
phi3-Epoch85-Total_Loss0.3117-Val_Loss0.3696.h5
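As a rough outline, the phi 1 schedule above (50 epochs with the backbone frozen, then 50 epochs of fine-tuning) corresponds to two train.py invocations like the following; only flags already documented above are used, and <stage1_checkpoint.h5> is a placeholder for the checkpoint saved by the first stage:
python3 train.py --snapshot imagenet --phi 1 --weighted-bifpn --gpu 3 --random-transform --compute-val-loss --freeze-backbone --batch-size 32 --steps 150 --epochs 50 pascal /opt/shared-disk2/sychou/comp2/VOCdevkit/trainval/VOC2007/
python3 train.py --snapshot <stage1_checkpoint.h5> --phi 1 --weighted-bifpn --gpu 3 --random-transform --compute-val-loss --batch-size 4 --steps 150 --epochs 50 pascal /opt/shared-disk2/sychou/comp2/VOCdevkit/trainval/VOC2007/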