DeepLearningFinalProject

COCO 2014 image captioning with visual attention

Architecture

Almost the same as the architecture in the Show, Attend and Tell paper.
Images are resized to 224x224 without cropping.
CNN: ResNet-50 pretrained on ImageNet (with a small amount of fine-tuning); the output feature map is 2048x7x7.
LSTM: hidden size 512.
Word embeddings: size 300, pretrained on GloVe (final experiment).
Attention scoring function: Bahdanau attention, Vt*tanh(W1Q+W2K).
For faster convergence these tricks are used:
Multi-sample dropout in the final fc layer as well as in the attention layers.
OneCycleLR to train with high learning rates.
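
The attention scoring function listed above can be illustrated with a small framework-free sketch. It assumes that Q is the decoder hidden state and K is the set of spatial feature vectors from the CNN (49 locations for a 7x7 map), which this README does not state explicitly. The function name `additive_attention` and the toy dimensions are hypothetical and are not taken from model.py.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def additive_attention(query, keys, W1, W2, v):
    """Bahdanau-style attention: score_i = v^T tanh(W1 q + W2 k_i).

    Returns (weights, context), where weights is a softmax over the
    scores and context is the weighted sum of the keys.
    """
    q_proj = matvec(W1, query)  # W1 Q, shared by every location
    scores = []
    for key in keys:
        k_proj = matvec(W2, key)  # W2 K for this location
        hidden = [math.tanh(a + b) for a, b in zip(q_proj, k_proj)]
        scores.append(sum(vi * hi for vi, hi in zip(v, hidden)))

    # Numerically stable softmax over locations.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]

    # Context vector: attention-weighted sum of the keys.
    dim = len(keys[0])
    context = [sum(w * key[d] for w, key in zip(weights, keys)) for d in range(dim)]
    return weights, context


# Toy sizes (assumption): query dim 2, key dim 3, attention dim 2, 2 locations.
query = [1.0, -1.0]
keys = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
W1 = [[0.5, 0.1], [0.2, -0.3]]
W2 = [[0.3, -0.2, 0.1], [0.0, 0.4, 0.2]]
v = [1.0, -1.0]
weights, context = additive_attention(query, keys, W1, W2, v)
```

In the real model the same computation would typically be batched with matrix operations; the loop form here is only meant to make each term of the formula visible.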

File Explanation

config.py

Default values and directories, used when arguments for training/build_vocab are not specified.
It is recommended to use the default configuration values rather than specifying arguments to the other scripts.

build_vocab.py

builds the vocabulary object and pickle dumps it for later use.

model.py

contains CNN encoder and LSTM decoder with attention

model_beam_inference.py

Runs beam-search inference for various beam sizes.
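
For readers unfamiliar with beam sizes, the following is a minimal, generic beam search sketch over a toy next-token table. It is not the code in model_beam_inference.py: the function `beam_search`, the `step_fn` interface, the `<start>`/`<end>` token names, and the toy probabilities are all assumptions made for illustration, and no length normalization is applied.

```python
import math


def beam_search(step_fn, start_token, end_token, beam_size, max_len):
    """Return the highest-scoring token sequence found by beam search.

    step_fn(prefix) must return {token: log_probability} for the next token.
    """
    beams = [([start_token], 0.0)]  # (tokens, summed log-probability)
    finished = []
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            for token, logp in step_fn(tokens).items():
                candidates.append((tokens + [token], score + logp))
        # Keep only the beam_size best expansions.
        candidates.sort(key=lambda cand: cand[1], reverse=True)
        beams = []
        for tokens, score in candidates[:beam_size]:
            if tokens[-1] == end_token:
                finished.append((tokens, score))
            else:
                beams.append((tokens, score))
        if not beams:
            break
    finished.extend(beams)  # fall back to unfinished hypotheses if needed
    return max(finished, key=lambda cand: cand[1])[0]


# Toy next-token probabilities keyed by the previous token (assumption).
TABLE = {
    "<start>": {"a": 0.6, "b": 0.4},
    "a": {"x": 0.55, "y": 0.45},
    "b": {"z": 0.9, "x": 0.1},
    "x": {"<end>": 1.0},
    "y": {"<end>": 1.0},
    "z": {"<end>": 1.0},
}


def toy_step(prefix):
    return {tok: math.log(p) for tok, p in TABLE[prefix[-1]].items()}


greedy = beam_search(toy_step, "<start>", "<end>", beam_size=1, max_len=5)
wider = beam_search(toy_step, "<start>", "<end>", beam_size=2, max_len=5)
```

In this toy table a beam size of 1 (greedy decoding) commits to the locally best first token, while a beam size of 2 keeps the alternative alive and finds a sequence with higher overall probability.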

gui.py

starts Flask server for frontend visualization

eval.py

used for evaluation of final model

How to train

Ensure that you are in the scratch folder.
The commands below clone the repository, build the vocabulary, and dump it as a binary (pickled) dictionary.
NOTE: we trained our models with a vocab threshold of 20, i.e. only words with 20 or more occurrences are added to the vocabulary. Our trained models will not work with a vocabulary built using any other threshold.

git clone https://github.com/garyongguanjie/DeepLearningFinalProject.git
cd DeepLearningFinalProject
python build_vocab.py --train_json_path TRAIN_JSON_PATH --vocab_path VOCAB_PATH --vocab_threshold 20
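
The thresholding rule described above can be sketched as follows. This is only an illustration and not the logic in build_vocab.py: the function name `select_vocab_words` is hypothetical, and lowercasing plus whitespace splitting is an assumed tokenization that may differ from the one the repository uses.

```python
from collections import Counter


def select_vocab_words(captions, threshold=20):
    """Return the sorted words that occur at least `threshold` times."""
    counts = Counter()
    for caption in captions:
        # Assumed tokenization: lowercase, then split on whitespace.
        counts.update(caption.lower().split())
    return sorted(word for word, n in counts.items() if n >= threshold)


# Tiny example with a threshold of 2 instead of 20.
toy_captions = ["a dog runs", "A dog sits", "a cat"]
toy_vocab = select_vocab_words(toy_captions, threshold=2)
```

Because the set of kept words determines the word-to-index mapping, a different threshold produces a different vocabulary size, which is why weights trained with one threshold cannot be loaded against a vocabulary built with another.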

Download GloVe embeddings

wget -c "https://sutdapac-my.sharepoint.com/:u:/g/personal/gary_ong_mymail_sutd_edu_sg/EYRfz8CJmNFEqW9FtA6dTRABzhWQuTNubo6j_mzfKL1tEg?e=Tdm2eb&download=1" -O glove_embeddings.pth

To train

python train.py --vocab_path VOCAB_PATH --train_img_path TRAIN_IMG_PATH --val_img_path VAL_IMG_PATH --train_json_path TRAIN_JSON_PATH --val_json_path VAL_JSON_PATH --glove_embed_path GLOVE_EMBED_PATH

Final Weights

5 epochs
Optimizer:
Encoder: SGD, momentum=0.9, weight_decay=5e-5
Decoder: SGD, momentum=0.9, weight_decay=1e-7
Scheduler:
Encoder: OneCycleLR, initial lr=0.0002, max_lr=0.005
Decoder: OneCycleLR, initial lr=0.004, max_lr=0.1
BLEU-4 score: 24.71
Download final weights

wget -c "https://sutdapac-my.sharepoint.com/:u:/g/personal/gary_ong_mymail_sutd_edu_sg/EUpurGS1mXxAg38s8lkAUb8BF80pzSB_Su6TQ6cbCBYXxw?e=SqWtLN&download=1" -O final_weights.zip
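
The initial and maximum learning rates listed above are consistent with initial lr = max_lr / 25, which matches the default `div_factor` of PyTorch's OneCycleLR. The sketch below is a framework-free approximation of such a schedule. It assumes the PyTorch defaults (`pct_start=0.3`, cosine annealing, `final_div_factor=1e4`) and a hypothetical `total_steps` of 1000, none of which are stated in this README, and it does not reproduce PyTorch's exact step arithmetic.

```python
import math


def cosine_interp(start, end, pct):
    """Cosine interpolation from start (pct=0) to end (pct=1)."""
    return end + (start - end) / 2.0 * (math.cos(math.pi * pct) + 1.0)


def one_cycle_lr(step, total_steps, max_lr,
                 div_factor=25.0, final_div_factor=1e4, pct_start=0.3):
    """Approximate one-cycle learning rate at a given step (0-indexed)."""
    initial_lr = max_lr / div_factor
    final_lr = initial_lr / final_div_factor
    last_step = total_steps - 1
    peak_step = pct_start * last_step  # where the rate reaches max_lr
    if step <= peak_step:
        # Warm-up phase: initial_lr -> max_lr.
        return cosine_interp(initial_lr, max_lr, step / peak_step)
    # Annealing phase: max_lr -> final_lr.
    return cosine_interp(max_lr, final_lr, (step - peak_step) / (last_step - peak_step))


TOTAL_STEPS = 1000  # hypothetical; depends on dataset size, batch size, epochs
encoder_lrs = [one_cycle_lr(s, TOTAL_STEPS, max_lr=0.005) for s in range(TOTAL_STEPS)]
decoder_lrs = [one_cycle_lr(s, TOTAL_STEPS, max_lr=0.1) for s in range(TOTAL_STEPS)]
```

The schedule starts at the listed initial rate, rises to the maximum over roughly the first 30% of training, and then decays to a very small value by the last step.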

Weights with GloVe embeddings fine-tuned

5 epochs
Optimizer:
Encoder: SGD, momentum=0.9, weight_decay=5e-5
Decoder: SGD, momentum=0.9, weight_decay=1e-7
Scheduler:
Encoder: OneCycleLR, initial lr=0.0002, max_lr=0.005
Decoder: OneCycleLR, initial lr=0.004, max_lr=0.1
BLEU-4 score: 24.49
Download weights with fine-tuned GloVe embeddings

wget -c "https://sutdapac-my.sharepoint.com/:u:/g/personal/gary_ong_mymail_sutd_edu_sg/EZX0t0yLy4tOh0tteU5HuFMBUyu7rPe-EdxL-irWGDzn6w?e=RXjhOb&download=1" -O weights_3.zip

First try Weights

3 epochs
Optimizer:
Encoder: SGD, momentum=0.9, weight_decay=5e-4
Decoder: SGD, momentum=0.9, weight_decay=1e-6
Scheduler:
Encoder: OneCycleLR, initial lr=0.0002, max_lr=0.005
Decoder: OneCycleLR, initial lr=0.004, max_lr=0.1
BLEU-4 score: 24.46
Download weights from first attempt

wget -c "https://sutdapac-my.sharepoint.com/:u:/g/personal/gary_ong_mymail_sutd_edu_sg/EUE8VQN6j7dNrRyhPLoCVFkBXYyRoQgcicrRQM_PhxYslg?e=xS0idk&download=1" -O weights.zip
unzip weights.zip

Deploy GUI

Ensure the final weights are downloaded into the ./weights folder and the binary vocab dictionary is built into the ./data folder.
Ensure the required libraries are installed.

python gui.py
