- PyTorch implementation of the paper StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks by Han Zhang, Tao Xu, Hongsheng Li, Shaoting Zhang, Xiaogang Wang, Xiaolei Huang, Dimitris Metaxas.
- We use BERT embeddings of the text descriptions instead of the char-CNN-RNN text embeddings used in the paper's implementation.
- Stage 1 trained using BERT embeddings instead of the original char-CNN-RNN text embeddings.
- Stage 2 trained using BERT embeddings instead of the original char-CNN-RNN text embeddings.
🐦 Examples for birds (char-CNN-RNN embeddings), more on YouTube:
🌻 Examples for flowers (char-CNN-RNN embeddings), more on YouTube:
git clone https://github.com/sahilkhose/StackGAN-BERT.git
pip3 install -r requirements.txt
Check instructions in /input/README.md
cd input/src
python3 data.py
Change DEVICE to cpu in input/src/config.py if CUDA is not available.
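The CPU fallback can also be chosen automatically rather than edited by hand. The following is a minimal sketch, assuming the config exposes a plain string named `DEVICE` (the name is taken from this README; how `input/src/config.py` actually defines it is not shown here). The helper `pick_device` is hypothetical and not part of the repository.

```python
def pick_device() -> str:
    """Return "cuda" when PyTorch can see a GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the helper still works without PyTorch
    except ImportError:
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"


# Hypothetical replacement for a hard-coded value in input/src/config.py.
DEVICE = pick_device()
```

With this in place the same config would work on both GPU and CPU machines, at the cost of hiding which device is used unless it is logged.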
python3 bert_emb.py
cd ../../src
Option 1: train with CLI arguments (see src/args.py)
python3 train.py --TRAIN_MAX_EPOCH 10
Option 2: train with YAML configs (see cfg/s1.yml and cfg/s2.yml)
python3 train.py --conf ../cfg/s1.yml
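The two options above can be combined in one argument parser. The sketch below is an illustration only: the flags `--conf` and `--TRAIN_MAX_EPOCH` come from the commands in this README, but the precedence (defaults, then YAML file, then CLI flags), the helper names, and the default value are assumptions, and `src/args.py` may resolve settings differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="StackGAN training (sketch)")
    parser.add_argument("--conf", default=None, help="path to a YAML config file")
    # default=None lets us tell "flag not passed" apart from an explicit value.
    parser.add_argument("--TRAIN_MAX_EPOCH", type=int, default=None)
    return parser


def load_yaml(path):
    """Read a YAML file into a dict (requires PyYAML)."""
    import yaml  # imported lazily so the rest of the sketch runs without PyYAML

    with open(path) as fh:
        return yaml.safe_load(fh) or {}


def resolve_settings(argv, defaults, file_loader=load_yaml):
    """Merge settings with assumed precedence: defaults < YAML file < CLI flags."""
    args = build_parser().parse_args(argv)
    settings = dict(defaults)
    if args.conf is not None:
        settings.update(file_loader(args.conf))
    # Only flags the user actually passed override the file values.
    cli = {k: v for k, v in vars(args).items() if k != "conf" and v is not None}
    settings.update(cli)
    return settings
```

For example, `resolve_settings(["--TRAIN_MAX_EPOCH", "10"], {"TRAIN_MAX_EPOCH": 100})` returns a dict in which `TRAIN_MAX_EPOCH` is 10; the default of 100 is a placeholder, not the repository's value.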
mkdir ../old_outputs
mv ../output ../old_outputs/output_stage-1
python3 train.py --conf ../cfg/s2.yml
mv ../output ../old_outputs/output_stage-2
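The `mkdir` and `mv` steps above move each stage's results out of `../output` so the next run starts with a clean directory. A minimal Python sketch of the same bookkeeping follows; the helper `archive_output` is hypothetical and not part of the repository. Unlike plain `mv`, it refuses to proceed if the target already exists, so an earlier archive cannot be mixed with a new one by accident.

```python
import shutil
from pathlib import Path


def archive_output(output_dir, archive_root, stage):
    """Move output_dir to archive_root/output_stage-<stage> and return the new path."""
    output_dir = Path(output_dir)
    if not output_dir.is_dir():
        raise FileNotFoundError(f"nothing to archive at {output_dir}")

    target = Path(archive_root) / f"output_stage-{stage}"
    if target.exists():
        raise FileExistsError(f"{target} already exists; refusing to overwrite")

    # Equivalent of `mkdir ../old_outputs`, but safe to repeat.
    target.parent.mkdir(parents=True, exist_ok=True)
    shutil.move(str(output_dir), str(target))
    return target
```

Run from `src/`, calling `archive_output("../output", "../old_outputs", 1)` after stage 1 and again with `2` after stage 2 would mirror the shell commands.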
To launch TensorBoard:
tensorboard --logdir=../output
If you find StackGAN useful in your research, please consider citing:
@inproceedings{han2017stackgan,
Author = {Han Zhang and Tao Xu and Hongsheng Li and Shaoting Zhang and Xiaogang Wang and Xiaolei Huang and Dimitris Metaxas},
Title = {StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks},
Year = {2017},
booktitle = {{ICCV}},
}
Follow-up work
- StackGAN++: Realistic Image Synthesis with Stacked Generative Adversarial Networks
- AttnGAN: Fine-Grained Text to Image Generation with Attentional Generative Adversarial Networks [supplementary] [code]
References