Goal:
Problem Statement - Using the first 11 frames of a video, predict the segmentation mask of the last (22nd) frame.
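As a rough illustration of the task setup, the sketch below splits a 22-frame clip into the 11 observed frames and the final target frame. The function name `split_clip` and the use of a plain Python sequence to stand in for frames are assumptions made for illustration, not code from this repository.

```python
def split_clip(frames, n_input=11):
    """Split a clip into observed input frames and the final target frame.

    `frames` is any sequence of frames (hypothetical representation);
    the last element is the frame whose mask must be predicted.
    """
    if len(frames) <= n_input:
        raise ValueError("clip must be longer than the number of input frames")
    inputs = frames[:n_input]  # frames 1..11, shown to the model
    target = frames[-1]        # frame 22, never shown to the model
    return inputs, target


# Stand-in clip: the integers 1..22 play the role of frames.
clip = list(range(1, 23))
inputs, target = split_clip(clip)
print(len(inputs), target)
```

Frames 12 to 21 are not used as input here; only the first 11 frames are available when predicting the mask of the final frame.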
The following commands create the conda environment, install all necessary dependencies, and activate the environment:
conda env create -f environment.yml
source activate projectDL
A Generative Adversarial Network (GAN) with a ConvLSTM generator and a simple linear discriminator is used for future frame prediction.
To run the pipeline and train the generator network to predict the 22nd frame, run the following command:
python frame_pred/src/main_hpc.py --cfg=config_hpc.json
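The command above passes a JSON config file through the `--cfg` flag. A minimal sketch of how a script could read such a flag is shown below; the helper names `parse_args` and `load_config` and the config keys `epochs` and `batch_size` are hypothetical, and the actual contents of `main_hpc.py` and `config_hpc.json` may differ.

```python
import argparse
import json
import os
import tempfile


def parse_args(argv=None):
    """Parse the command line; only the --cfg flag is taken from the README."""
    parser = argparse.ArgumentParser(description="Train the frame prediction model")
    parser.add_argument("--cfg", required=True, help="path to a JSON config file")
    return parser.parse_args(argv)


def load_config(path):
    """Read the JSON config file into a plain dictionary."""
    with open(path) as f:
        return json.load(f)


# Demo with a throwaway config file; the keys below are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    cfg_path = os.path.join(tmp, "config_hpc.json")
    with open(cfg_path, "w") as f:
        json.dump({"epochs": 10, "batch_size": 4}, f)
    args = parse_args([f"--cfg={cfg_path}"])
    cfg = load_config(args.cfg)

print(cfg["epochs"], cfg["batch_size"])
```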
Segmentation is performed using a U-Net model.
To run the pipeline that builds and trains the segmentation model, run the following command:
python segmentation/segmentation.py
To connect the future frame prediction model and the segmentation model and generate the mask of the 22nd frame of a video (given only the first 11 frames as input), update the path to the dataset on line 440 in infer.py and then run the following command:
python infer.py
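Conceptually, inference chains the two stages: the frame predictor produces the 22nd frame from the first 11, and the segmentation model turns that predicted frame into a mask. The sketch below shows this composition with toy stand-ins; `predict_final_mask`, `toy_predictor`, and `toy_segmenter` are hypothetical names, and the real `infer.py` may be organized differently.

```python
def predict_final_mask(frames, frame_predictor, segmenter, n_input=11):
    """Chain the two stages: predict the last frame, then segment it.

    `frame_predictor` and `segmenter` are any callables (hypothetical
    stand-ins for the trained ConvLSTM generator and the U-Net).
    """
    if len(frames) < n_input:
        raise ValueError("need at least n_input observed frames")
    observed = frames[:n_input]                  # only the first 11 frames are used
    predicted_frame = frame_predictor(observed)  # stage 1: future frame prediction
    return segmenter(predicted_frame)            # stage 2: segmentation of that frame


# Toy stand-ins so the sketch runs without trained models.
def toy_predictor(observed):
    return observed[-1] + 11  # pretend frame 22 is derived from frame 11


def toy_segmenter(frame):
    return f"mask_of_{frame}"


mask = predict_final_mask(list(range(1, 12)), toy_predictor, toy_segmenter)
print(mask)
```

Passing the two models in as callables keeps the stages independent, so either one can be retrained or swapped without touching the other.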
This project was part of the coursework for the Deep Learning (Spring 2023) course at NYU.
Contributors:
- Anoushka Gupta
- Charvi Gupta
- Anisha Bhatnagar