End-to-End Video Captioning with Multitask Reinforcement Learning


This repo holds the code and models for the end-to-end video captioning method presented at WACV 2019:

End-to-End Video Captioning with Multitask Reinforcement Learning

Lijun Li, Boqing Gong

[Arxiv Preprint]

If you use our code, please cite our paper.

Prerequisites

Use the following command to clone the repository to your local machine:

git clone https://github.com/adwardlee/multitask-end-to-end-video-captioning.git

Download Datasets

We support experiments on two publicly available video captioning datasets: MSVD and MSR-VTT.

Preprocess data

Extract all frames from videos

First, extract the frames with cpu_extract.py. Then use read_certrain_number_frame.py to uniformly sample 5 frames from all the frames of each video. Finally, set the model path in tf_feature_extract.py and run it to extract the Inception-ResNet-v2 features of the sampled frames.
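The uniform sampling step can be illustrated with a minimal sketch. This is not the code in read_certrain_number_frame.py; the function name `uniform_sample_indices` is hypothetical, and taking the middle frame of each equal-length segment is only one common convention for uniform sampling, assumed here for illustration.

```python
def uniform_sample_indices(num_frames, num_samples=5):
    """Return num_samples frame indices spread evenly over a video.

    Assumption: the video is split into num_samples equal-length segments
    and the middle frame of each segment is taken. The repo's own script
    may use a different convention.
    """
    if num_frames <= 0:
        raise ValueError("num_frames must be positive")
    if num_samples <= 0:
        raise ValueError("num_samples must be positive")
    # Midpoint of segment i is (i + 0.5) * num_frames / num_samples;
    # integer arithmetic avoids floating-point rounding surprises.
    return [((2 * i + 1) * num_frames) // (2 * num_samples)
            for i in range(num_samples)]


if __name__ == "__main__":
    # A 100-frame video yields one index per 20-frame segment.
    print(uniform_sample_indices(100))
    # Videos shorter than num_samples repeat some frames.
    print(uniform_sample_indices(3))
```

With this convention every video produces exactly 5 indices, so the feature extractor always receives a fixed-length input, even for very short clips.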

Training from scratch

Use the *_s2vt.py scripts. Before running one, change the model path in its evaluation function and some of the global parameters in the file. For example,

Step 1

python tf_s2vt.py --gpu 0 --task train

Step 2

Starting from the pretrained model from Step 1, run

python reinforcement_multisampling_tf_s2vt.py --task train

Step 3

Starting from the pretrained model from Step 2, run

python reinforce_multitask_e2e_attribute_s2vt.py --task train
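The three-stage schedule above can be summarized as a small dry-run helper. It only prints the commands in order and does not launch training, because the model path inside each script has to be edited by hand between stages. The names `STAGES` and `describe_stages` are hypothetical and are not part of the repo.

```python
import shlex

# Stage order and commands are copied verbatim from Steps 1-3 above.
# Each later stage starts from the model trained in the previous one,
# so the model path in the script must be updated before running it.
STAGES = [
    ("Step 1", ["python", "tf_s2vt.py", "--gpu", "0", "--task", "train"]),
    ("Step 2", ["python", "reinforcement_multisampling_tf_s2vt.py",
                "--task", "train"]),
    ("Step 3", ["python", "reinforce_multitask_e2e_attribute_s2vt.py",
                "--task", "train"]),
]


def describe_stages(stages=STAGES):
    """Return one printable line per stage, in training order."""
    return [f"{name}: {shlex.join(argv)}" for name, argv in stages]


if __name__ == "__main__":
    for line in describe_stages():
        print(line)
```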

Testing existing models

Evaluate models

Use the *_s2vt.py scripts. Before running one, change the model path in its evaluation function and some of the global parameters in the file. For example,

python tf_s2vt.py --gpu 0 --task evaluate

To test the pretrained models, please refer to Repo.

The MSVD models can be downloaded from here.

The MSR-VTT models can be downloaded from here.

We also apply temporal attention in TensorFlow.
