Humpback whale re-identification using Siamese neural nets

Code for the 5th place solution in the Humpback Whale Identification contest

Requirements

Needs the system packages python-dev and libvips.

Hardware: NVIDIA 1080 Ti GPU
Software: Python 3.6, keras==2.2.4, keras-retinanet==0.5.0, albumentations, pyvips, scipy, numpy, pandas, tqdm, lap, scikit-learn, tensorflow

Links

Input data location

Put the training and test images in separate folders:

../data/train/
../data/test/

train.csv and sample_submission.csv should be at the following locations:

../data/train.csv
../data/sample_submission.csv
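
Before starting the pipeline it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of the repository: the helper name missing_inputs is hypothetical, and it only checks that the two folders and two CSV files exist.

```python
from pathlib import Path

# Names taken from the layout described above.
EXPECTED_DIRS = ["train", "test"]
EXPECTED_FILES = ["train.csv", "sample_submission.csv"]


def missing_inputs(data_root):
    """Return the expected input paths that do not exist under data_root.

    Hypothetical helper for illustration; the repository scripts do not
    call it.
    """
    root = Path(data_root)
    missing = [root / d for d in EXPECTED_DIRS if not (root / d).is_dir()]
    missing += [root / f for f in EXPECTED_FILES if not (root / f).is_file()]
    return missing
```

Calling missing_inputs("../data") from inside the code folder should return an empty list when everything is in place.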

Part 1 - Bounding box models

Requires: ../modified_data/p2bb_v5.pkl
Requires: ../modified_data/retinanet/cropping_train_v2.csv - some boxes for playground competition

cd code

PYTHONPATH="$PWD" python3 retinanet/r10_create_csv_for_retinanet.py
PYTHONPATH="$PWD" python3 retinanet/r30_train_backbone_resnet152_kfold.py
PYTHONPATH="$PWD" python3 retinanet/r31_convert_retinanet_model.py
PYTHONPATH="$PWD" python3 retinanet/r31_get_vectors_backbone_resnet152_kfold.py
PYTHONPATH="$PWD" python3 retinanet/r32_average_boxes.py

As a result, we obtain the following files:

../modified_data/p2bb_averaged_v1.pkl - boxes for train/test images
../modified_data/p2bb_averaged_playground_v1.pkl - boxes for playground images
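
The last step averages the boxes predicted by the k-fold RetinaNet models. The sketch below illustrates the idea only. It assumes each fold yields a dict mapping an image name to an (x0, y0, x1, y1) box; the actual logic and pickle layout in r32_average_boxes.py may differ.

```python
from collections import defaultdict


def average_boxes(fold_predictions):
    """Average per-image boxes over several folds.

    fold_predictions: list of dicts, image name -> (x0, y0, x1, y1).
    This input format is an assumption made for illustration.
    An image missing from some folds is averaged over the folds that have it.
    """
    collected = defaultdict(list)
    for prediction in fold_predictions:
        for image, box in prediction.items():
            collected[image].append(box)
    # zip(*boxes) groups the same coordinate from every fold together.
    return {
        image: tuple(sum(coords) / len(boxes) for coords in zip(*boxes))
        for image, boxes in collected.items()
    }
```

Averaging coordinates directly is the simplest choice; it works well when the folds roughly agree and degrades if one fold produces an outlier box.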

Part 2 - Siamese Nets with DenseNet121 and SE-ResNext50

Generate KFold splits

python3 r10_create_kfold_split.py

As a result, we have 2 files with different KFold splits:

../modified_data/kfold/new_4_folds_split_train_val_v1.pkl - kfold split v1 (used by DenseNet121)
../modified_data/kfold/new_4_folds_split_train_val_v2.pkl - kfold split v2 (used by SE-ResNext50)

Part with siamese nets (DenseNet121)

python3 siamese_net_v5_densenet121/r10_seamese_net_warmstart_from_scratch_224px.py
python3 siamese_net_v5_densenet121/r11_seamese_net_warmstart_finetune_384px.py
python3 siamese_net_v5_densenet121/r15_seamese_net_train_v5_finetune_384px.py 0
python3 siamese_net_v5_densenet121/r15_seamese_net_train_v5_finetune_384px.py 1
python3 siamese_net_v5_densenet121/r15_seamese_net_train_v5_finetune_384px.py 2
python3 siamese_net_v5_densenet121/r15_seamese_net_train_v5_finetune_384px.py 3
python3 siamese_net_v5_densenet121/r15_seamese_net_train_v5_finetune_512px.py 0
python3 siamese_net_v5_densenet121/r15_seamese_net_train_v5_finetune_512px.py 1
python3 siamese_net_v5_densenet121/r15_seamese_net_train_v5_finetune_512px.py 2
python3 siamese_net_v5_densenet121/r15_seamese_net_train_v5_finetune_512px.py 3
python3 siamese_net_v5_densenet121/r26_seamese_net_inference_v5_512px.py
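
The fine-tuning scripts take the fold index (0 to 3) as their only argument, so the sequence above can be generated rather than typed out. The following is a sketch of such a driver, assuming the stages must run strictly in the listed order; the function names densenet_commands and run_all are hypothetical and not part of the repository.

```python
import subprocess

FOLDS = range(4)
PREFIX = "siamese_net_v5_densenet121/"


def densenet_commands(python="python3"):
    """Build the DenseNet121 command list in the order given above."""
    steps = [
        ["r10_seamese_net_warmstart_from_scratch_224px.py"],
        ["r11_seamese_net_warmstart_finetune_384px.py"],
    ]
    # One fine-tuning run per fold, first at 384px and then at 512px.
    steps += [["r15_seamese_net_train_v5_finetune_384px.py", str(f)] for f in FOLDS]
    steps += [["r15_seamese_net_train_v5_finetune_512px.py", str(f)] for f in FOLDS]
    steps.append(["r26_seamese_net_inference_v5_512px.py"])
    return [[python, PREFIX + step[0], *step[1:]] for step in steps]


def run_all(commands):
    """Run each command in turn and stop at the first failing stage."""
    for command in commands:
        subprocess.run(command, check=True)
```

Running run_all(densenet_commands()) from inside the code folder would execute the eleven stages one after another. The same pattern applies to the SE-ResNext50 scripts with their own file names.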

Part with siamese nets (SE-ResNext50)

python3 siamese_net_v6_se_resnext/r11_seamese_net_warmstart_from_scratch_224px.py
python3 siamese_net_v6_se_resnext/r12_seamese_net_warmstart_from_scratch_384px.py
python3 siamese_net_v6_se_resnext/r15_seamese_net_train_v6_finetune_384px.py 0
python3 siamese_net_v6_se_resnext/r15_seamese_net_train_v6_finetune_384px.py 1
python3 siamese_net_v6_se_resnext/r15_seamese_net_train_v6_finetune_384px.py 2
python3 siamese_net_v6_se_resnext/r15_seamese_net_train_v6_finetune_384px.py 3
python3 siamese_net_v6_se_resnext/r16_seamese_net_inference_v6_384px.py

Create tables for using model predictions in the ensemble

python3 r20_prepare_matrices_for_ensemble.py

As a result, we will have 4 files with prediction matrices, which will be used for the ensemble:

../features/cv-analysis-fs14-LB959-densenet121-512px-sparse.pkl
../features/cv-analysis-fs14-LB959-densenet121-512px-sparse-test.pkl
../features/cv-analysis-fs16-LB959-seresnext50-384px-sparse.pkl
../features/cv-analysis-fs16-LB959-seresnext50-384px-sparse-test.pkl

Part 3 - Siamese Nets with customized ConvNets model

Create kfold splits

python kfold_splits_for_kernel.py

Train Siamese Nets with k-fold approach

Train the siamese nets on four folds. Each training run requires two GPUs, so running all four folds in parallel takes eight GPUs. With fewer GPUs, run the folds one after another.

python snn_train_kernel_384_to_1024.py --CUDA_VISIBLE_DEVICES 0,1 --RUN_FOLD 0
python snn_train_kernel_384_to_1024.py --CUDA_VISIBLE_DEVICES 2,3 --RUN_FOLD 1
python snn_train_kernel_384_to_1024.py --CUDA_VISIBLE_DEVICES 4,5 --RUN_FOLD 2
python snn_train_kernel_384_to_1024.py --CUDA_VISIBLE_DEVICES 6,7 --RUN_FOLD 3
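
With fewer than eight GPUs, the folds have to be grouped into batches that run one after another. The sketch below shows one way to plan that. The --CUDA_VISIBLE_DEVICES and --RUN_FOLD flags are the ones used above; the helper names fold_batches and train_command are hypothetical.

```python
def fold_batches(num_gpus, num_folds=4, gpus_per_fold=2):
    """Group folds into batches that fit on the available GPUs.

    Returns a list of batches; each batch is a list of (fold, devices)
    pairs that can run at the same time.
    """
    slots = num_gpus // gpus_per_fold
    if slots < 1:
        raise ValueError("each fold needs %d GPUs" % gpus_per_fold)
    batches = []
    for start in range(0, num_folds, slots):
        batch = []
        folds = range(start, min(start + slots, num_folds))
        for slot, fold in enumerate(folds):
            first = slot * gpus_per_fold
            devices = ",".join(str(g) for g in range(first, first + gpus_per_fold))
            batch.append((fold, devices))
        batches.append(batch)
    return batches


def train_command(fold, devices):
    """Build the training command for one fold."""
    return ["python", "snn_train_kernel_384_to_1024.py",
            "--CUDA_VISIBLE_DEVICES", devices, "--RUN_FOLD", str(fold)]
```

For example, fold_batches(4) yields two batches of two folds each, both using device pairs 0,1 and 2,3, while fold_batches(8) reproduces the single parallel batch listed above.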

Run inference for the customized ConvNets siamese nets

Once the training runs above are done, pick the best saved weights for each model from its log, then run the inference below to generate the final averaged test-vs-train score matrix:

python snn_inference_kernel_1024.py --model_weights_1 ../path_to_your_best_weights_1 --model_weights_2 ../path_to_your_best_weights_2 --model_weights_3 ../path_to_your_best_weights_3 --model_weights_4 ../path_to_your_best_weights_4
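
The averaged test-vs-train score matrix is the element-wise mean of the matrices produced by the four fold models. The sketch below illustrates that operation on plain nested lists so it runs without extra packages. It is an assumption about what the averaging amounts to, not the code in snn_inference_kernel_1024.py, and the function name is hypothetical.

```python
def average_score_matrices(matrices):
    """Element-wise mean of equally shaped score matrices.

    matrices: list of matrices, each a list of rows, where row i holds
    the scores of test image i against every training image.
    """
    if not matrices:
        raise ValueError("need at least one matrix")
    shape = [len(row) for row in matrices[0]]
    for matrix in matrices:
        if [len(row) for row in matrix] != shape:
            raise ValueError("all fold matrices must have the same shape")
    count = len(matrices)
    # The outer zip pairs up matching rows, the inner zip matching cells.
    return [
        [sum(values) / count for values in zip(*rows)]
        for rows in zip(*matrices)
    ]
```

In practice the same computation on arrays would be a single numpy mean over the stacked fold matrices.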

Part 4 - Ensemble all three models, and apply post-processing steps

Make sure the outputs of all three models have been generated in ../features/, then run:

python final_ensemble_with_post_proc.py

The final submission will be written to:

../submission/final_submit_with_post_proc.csv
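
The competition expects up to five space-separated whale IDs per test image, with new_whale as the label for an unseen animal. The sketch below shows one common way to turn a row of test-vs-train scores into such a prediction. It is an illustration under assumptions: the function name top5_ids and the threshold value are hypothetical, and the post-processing in final_ensemble_with_post_proc.py is not described here and may work differently.

```python
def top5_ids(scores, train_ids, new_whale_threshold=0.5):
    """Rank whale IDs for one test image.

    scores: similarity of the test image to each training image.
    train_ids: whale ID of each training image, in the same order.
    new_whale_threshold: hypothetical cut-off; 'new_whale' is placed
    ahead of every known ID that scores below it.
    """
    # Keep the best score seen for each whale ID.
    best = {}
    for score, whale_id in zip(scores, train_ids):
        if score > best.get(whale_id, float("-inf")):
            best[whale_id] = score
    ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)

    result = []
    placed = False
    for whale_id, score in ranked:
        if not placed and score < new_whale_threshold:
            result.append("new_whale")
            placed = True
        result.append(whale_id)
    if not placed:
        result.append("new_whale")
    return result[:5]
```

Joining the returned list with spaces gives the Id field for one row of the submission file.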
